
Project Nature & Content: Computer Vision

Motivation and Background Documentation

The primary intent of this assignment is to give you hands-on, practical experience with understanding the transition from simple (single hidden layer) to deep (multiple hidden layers) networks.

This hinges on understanding how hidden nodes learn to extract features from their inputs. When there are multiple hidden node layers, each successive layer extracts more generalized and abstract features.

When a hidden layer "learns" the kinds of features that are inherent in its input data, it is using a generative method. In this case, we're not telling it what those feature classes are; it has to figure them out on its own.

What we will pragmatically do is emulate how a hidden layer learns features by constructing "classes" of input data - where we think that the classes share similar features. We'll put input data into those classes that we THINK have similar features. Then, we conduct experiments to determine what the hidden nodes are actually learning.

You will have gathered and preprocessed your data, designed and refined your network structure, trained and tested the network, varied the hyperparameters to improve performance and analyzed/assessed the results.

The most important thing is not just to give a summary of classification rates/errors. I trust that you can get a working classifier, or can train a network to do any useful task.

You are welcome to use the CIFAR-10 data for this exercise. You are welcome to use Python with user-defined functions, Python with TensorFlow, and/or Python with Keras. For example, you can conduct the following experiments on the CIFAR-10 data. The goal is to compare DNN and CNN architectures. In all the experiments, you may hold some parameters constant (for example, a batch size of 100, 20 epochs, the same optimizer, and the same cross-entropy loss function) so that the comparisons are fair.

Experiment 1: DNN with 2 layers (no regularization)

Experiment 2: DNN with 3 layers (no regularization)

Experiment 3: CNN with 2 convolution/max pooling layers (no regularization)

Experiment 4: CNN with 3 convolution/max pooling layers (no regularization)

Experiment 5+: You will conduct several more experiments. (a) Redo all 4 experiments with some regularization technique. (b) Create more experiments on your own by tweaking architectures and/or hyperparameters.
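One way to keep the comparison across the experiments above fair is to define the shared settings once and reuse them everywhere. The sketch below is only an illustration: `SHARED`, `make_experiment`, and the per-experiment keys (`kind`, `hidden_layers`, `conv_pool_blocks`) are hypothetical names, and the values are the examples mentioned in the assignment text, not requirements.

```python
# Shared training settings, held constant across experiments (values are
# the examples given in the assignment text, not requirements).
SHARED = {
    "batch_size": 100,
    "epochs": 20,
    "optimizer": "adam",
    "loss": "sparse_categorical_crossentropy",
}


def make_experiment(name, **architecture):
    """Return one experiment's settings; refuse to override shared ones."""
    clash = set(architecture) & set(SHARED)
    if clash:
        raise ValueError(f"{name}: shared settings overridden: {sorted(clash)}")
    return {"name": name, **SHARED, **architecture}


experiments = [
    make_experiment("Experiment 1", kind="DNN", hidden_layers=2),
    make_experiment("Experiment 2", kind="DNN", hidden_layers=3),
    make_experiment("Experiment 3", kind="CNN", conv_pool_blocks=2),
    make_experiment("Experiment 4", kind="CNN", conv_pool_blocks=3),
]

for exp in experiments:
    print(exp["name"], exp["kind"], exp["batch_size"], exp["epochs"])
```

Raising an error on an overridden shared key is a design choice: it makes an accidental change to the batch size or epoch count visible immediately instead of silently skewing one row of the results table.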

Result1: Create a table with the accuracy and loss for train/test/validation & process time for ALL the models.

Result2: Take Experiment 3. Extract the outputs of 2 filters from the 2 max pooling layers and visualize them in a grid as images. See whether the 'lit-up' regions correspond to some features in the original images.
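As a rough picture of what a max pooling output contains before it is plotted, the sketch below applies 2x2 max pooling to one small single-channel feature map with NumPy. It is not the Keras code needed for Result2: `max_pool_2x2` is a hypothetical helper that assumes a stride of 2, no padding, and even height and width, and the input values are made up.

```python
import numpy as np


def max_pool_2x2(feature_map):
    """2x2 max pooling with stride 2 on a (H, W) array; H and W must be even."""
    h, w = feature_map.shape
    # Split each axis into (blocks, 2) and take the max inside every 2x2 block.
    return feature_map.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))


fmap = np.array([[1, 2, 0, 0],
                 [3, 4, 0, 5],
                 [0, 0, 7, 6],
                 [0, 9, 8, 0]])
pooled = max_pool_2x2(fmap)
print(pooled)        # the largest value of each 2x2 block
print(pooled.shape)  # (2, 2): each spatial dimension is halved
```

Large values in the pooled map mark the blocks where the filter responded most strongly, which is what shows up as bright regions when the map is drawn as an image.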

Import packages needed¶

In [4]:
import numpy as np
import time
import pandas as pd
from packaging import version
from collections import Counter
import random

from sklearn.metrics import confusion_matrix, classification_report
from sklearn.metrics import accuracy_score
from sklearn.metrics import mean_squared_error as MSE
from sklearn.model_selection import train_test_split
from sklearn.manifold import TSNE

import matplotlib.pyplot as plt
import matplotlib as mpl
import seaborn as sns

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import models, layers
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPool2D, BatchNormalization, Dropout, Flatten, Dense
from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping
from tensorflow.keras.preprocessing import image
from tensorflow.keras.utils import to_categorical
import tensorflow.keras.backend as k
# from tensorflow.keras.optimizers.legacy import Adam

from tensorflow.python.client import device_lib
import warnings
warnings.filterwarnings('ignore')
In [5]:
from IPython.display import display, HTML
display(HTML("<style>.container { width:80% !important; }</style>"))

Verify TensorFlow Version and Keras Version¶

In [6]:
print("This notebook requires TensorFlow 2.0 or above")
print("TensorFlow version: ", tf.__version__)
assert version.parse(tf.__version__).release[0] >=2
This notebook requires TensorFlow 2.0 or above
TensorFlow version:  2.15.0
In [7]:
seed_val = 43

# The below is necessary for starting Numpy generated random numbers
# in a well-defined initial state.
np.random.seed(seed_val)

# The below is necessary for starting core Python generated random numbers
# in a well-defined state.
random.seed(seed_val)

# The below set_seed() will make random number generation
# in the TensorFlow backend have a well-defined initial state.
# For further details, see:
# https://www.tensorflow.org/api_docs/python/tf/random/set_seed
tf.random.set_seed(seed_val)

EDA Functions¶

In [8]:
def get_three_classes(x, y):
    def indices_of(class_id):
        indices, _ = np.where(y == float(class_id))
        return indices

    indices = np.concatenate([indices_of(0), indices_of(1), indices_of(2)], axis=0)

    x = x[indices]
    y = y[indices]

    count = x.shape[0]
    indices = np.random.choice(range(count), count, replace=False)

    x = x[indices]
    y = y[indices]

    y = tf.keras.utils.to_categorical(y)

    return x, y
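`get_three_classes` depends on two details that are easy to miss: the labels are stored as an `(N, 1)` column, so `np.where` returns a pair of index arrays, and the class ids 0 to 2 are turned into one-hot rows. The NumPy-only sketch below walks through both steps on made-up labels. It uses `np.eye` as a stand-in for `to_categorical`, on the assumption that the two agree for integer ids 0..2 apart from the dtype.

```python
import numpy as np

# Made-up labels in the same (N, 1) column layout as the CIFAR-10 labels.
y = np.array([[6], [0], [2], [9], [1], [0]])

# np.where on a 2-D array returns (row_indices, column_indices);
# only the row indices identify the matching examples.
rows, cols = np.where(y == 2)
print(rows, cols)

# Keep classes 0, 1 and 2, in that order, as get_three_classes does
# before shuffling.
keep = np.concatenate([np.where(y == c)[0] for c in (0, 1, 2)])
y_small = y[keep]

# One-hot encode with an identity matrix; for integer ids 0..2 this yields
# the same rows that to_categorical would produce (as float64 here).
one_hot = np.eye(3)[y_small.ravel()]
print(one_hot)
```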
In [9]:
def show_random_examples(x, y, p):
    indices = np.random.choice(range(x.shape[0]), 10, replace=False)

    x = x[indices]
    y = y[indices]
    p = p[indices]

    plt.figure(figsize=(10, 5))
    for i in range(10):
        plt.subplot(2, 5, i + 1)
        plt.imshow(x[i])
        plt.xticks([])
        plt.yticks([])
        col = 'green' if np.argmax(y[i]) == np.argmax(p[i]) else 'red'
        plt.xlabel(class_names_preview[np.argmax(p[i])], color=col)
    plt.show()

Research Assignment Reporting Functions¶

In [10]:
def plot_history(history):
  losses = history.history['loss']
  accs = history.history['accuracy']
  val_losses = history.history['val_loss']
  val_accs = history.history['val_accuracy']
  epochs = len(losses)

  plt.figure(figsize=(16, 4))
  for i, metrics in enumerate(zip([losses, accs], [val_losses, val_accs], ['Loss', 'Accuracy'])):
    plt.subplot(1, 2, i + 1)
    plt.plot(range(epochs), metrics[0], label='Training {}'.format(metrics[2]))
    plt.plot(range(epochs), metrics[1], label='Validation {}'.format(metrics[2]))
    plt.legend()
  plt.show()

def display_training_curves(training, validation, title, subplot):
  ax = plt.subplot(subplot)
  ax.plot(training)
  ax.plot(validation)
  ax.set_title('model '+ title)
  ax.set_ylabel(title)
  ax.set_xlabel('epoch')
  ax.legend(['training', 'validation'])

def print_validation_report(y_test, predictions):
    print("Classification Report")
    print(classification_report(y_test, predictions))
    print('Accuracy Score: {}'.format(accuracy_score(y_test, predictions)))
    print('Root Mean Square Error: {}'.format(np.sqrt(MSE(y_test, predictions))))

def plot_confusion_matrix(y_true, y_pred):
    mtx = confusion_matrix(y_true, y_pred)
    fig, ax = plt.subplots(figsize=(16,12))
    sns.heatmap(mtx, annot=True, fmt='d', linewidths=.75,  cbar=False, ax=ax,cmap='Blues',linecolor='white')
    #  square=True,
    plt.ylabel('true label')
    plt.xlabel('predicted label')

Loading cifar10 Dataset¶

The CIFAR-10 dataset consists of 60000 32x32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images.

The dataset is divided into five training batches and one test batch, each with 10000 images. The test batch contains exactly 1000 randomly-selected images from each class. The training batches contain the remaining images in random order, but some training batches may contain more images from one class than another. Between them, the training batches contain exactly 5000 images from each class.

In [11]:
(x_train, y_train), (x_test, y_test) = keras.datasets.cifar10.load_data()
Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz
170498071/170498071 [==============================] - 2s 0us/step

EDA¶

In [12]:
print('train_images:\t{}'.format(x_train.shape))
print('train_labels:\t{}'.format(y_train.shape))
print('test_images:\t{}'.format(x_test.shape))
print('test_labels:\t{}'.format(y_test.shape))
train_images:	(50000, 32, 32, 3)
train_labels:	(50000, 1)
test_images:	(10000, 32, 32, 3)
test_labels:	(10000, 1)
In [13]:
print("First ten labels training dataset:\n {}\n".format(y_train[0:10]))
print("These are numeric labels; they need to be converted to item descriptions")
First ten labels training dataset:
 [[6]
 [9]
 [9]
 [4]
 [1]
 [1]
 [2]
 [7]
 [8]
 [3]]

These are numeric labels; they need to be converted to item descriptions
In [14]:
(train_images, train_labels),(test_images, test_labels)= tf.keras.datasets.cifar10.load_data()
In [15]:
# Build the three-class preview from the test images
x_preview, y_preview = get_three_classes(test_images, test_labels)
In [16]:
class_names_preview = ['aeroplane', 'car', 'bird']

show_random_examples(x_preview, y_preview, y_preview)
[Figure: grid of 10 random test images from the three preview classes, each labeled with its class name]
In [17]:
plt.figure(figsize = (12 ,8))
items = [{'Class': x, 'Count': y} for x, y in Counter(train_labels.ravel()).items()]
distribution = pd.DataFrame(items).sort_values(['Class'])
sns.barplot(x=distribution.Class, y=distribution.Count);
[Figure: bar plot of the number of training images per class]
In [18]:
class_names = ['airplane'
,'automobile'
,'bird'
,'cat'
,'deer'
,'dog'
,'frog'
,'horse'
,'ship'
,'truck']

Create Validation Data Set¶

In [19]:
x_train_split, x_valid_split, y_train_split, y_valid_split = train_test_split(x_train
                                                                              ,y_train
                                                                              ,test_size=.1
                                                                              ,random_state=seed_val
                                                                              ,shuffle=True)

Confirm Datasets {Train, Validation, Test}¶

In [20]:
print("Training\t", x_train_split.shape,
      "\nValidation\t", x_valid_split.shape,
      "\nTest\t\t", x_test.shape)
Training	 (45000, 32, 32, 3) 
Validation	 (5000, 32, 32, 3) 
Test		 (10000, 32, 32, 3)

Rescale Examples {Train, Validation, Test}¶

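The images are 32x32x3 NumPy arrays (height x width x RGB channels), with pixel values ranging from 0 to 255

  1. Each element in each example is the intensity of one color channel of one pixel
  2. Values range from 0 to 255
  3. 0 = no intensity (all three channels at 0 gives black)
  4. 255 = full intensity (all three channels at 255 gives white)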
In [21]:
x_train_norm = x_train_split/255
x_valid_norm = x_valid_split/255
x_test_norm = x_test/255
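Dividing a `uint8` array by 255 rescales the values to the range 0 to 1, and it also changes the dtype. The sketch below uses a small random array in place of the CIFAR-10 images to show the effect and one possible alternative, casting to `float32` first. Whether the smaller dtype is worth it is a judgment call; the array and variable names here are made up for illustration.

```python
import numpy as np

# A tiny stand-in for a batch of uint8 RGB images (2 images of 32x32x3).
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(2, 32, 32, 3), dtype=np.uint8)

scaled64 = images / 255                        # true division promotes to float64
scaled32 = images.astype(np.float32) / 255.0   # stays float32, half the memory

print(scaled64.dtype, scaled32.dtype)
print(scaled64.min() >= 0.0, scaled64.max() <= 1.0)
print(scaled64.nbytes, scaled32.nbytes)
```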

Create the Model¶

In [22]:
results = {}

Experiment 1:¶

  • DNN with 2 layers
  • no regularization
Build DNN Model¶
In [23]:
k.clear_session()
model_01 = Sequential([
  Flatten(input_shape=x_train_norm.shape[1:]),
  Dense(units=384,activation=tf.nn.relu),
  Dense(units=768,activation=tf.nn.relu),
  Dense(units=10, activation=tf.nn.softmax)
])
results["Experiment1"] = {}
results["Experiment1"]["Architecture"] = "• DNN with 2 layers\n • no regularization"
2024-10-20 03:42:39.377699: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1929] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 20974 MB memory:  -> device: 0, name: NVIDIA L4, pci bus id: 0000:35:00.0, compute capability: 8.9
In [24]:
model_01.summary()
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 flatten (Flatten)           (None, 3072)              0         
                                                                 
 dense (Dense)               (None, 384)               1180032   
                                                                 
 dense_1 (Dense)             (None, 768)               295680    
                                                                 
 dense_2 (Dense)             (None, 10)                7690      
                                                                 
=================================================================
Total params: 1483402 (5.66 MB)
Trainable params: 1483402 (5.66 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
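The `Param #` column in the summary can be reproduced by hand: a dense layer has one weight per input-unit pair plus one bias per unit. The short check below does that arithmetic for the layer widths shown above; `dense_params` is a hypothetical helper written for this illustration.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases of one fully connected layer."""
    return n_inputs * n_units + n_units


# Layer widths of model_01: 32*32*3 flattened inputs, then 384, 768, 10 units.
widths = [32 * 32 * 3, 384, 768, 10]
per_layer = [dense_params(a, b) for a, b in zip(widths, widths[1:])]
total = sum(per_layer)

print(per_layer)  # [1180032, 295680, 7690]
print(total)      # 1483402
print(round(total * 4 / 1024**2, 2))  # 5.66 (MB at 4 bytes per float32 parameter)
```

Almost 80% of the parameters sit in the first dense layer, because every one of the 3072 input values is connected to every one of its 384 units.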
In [25]:
keras.utils.plot_model(model_01, "CIFAR10_EXP_01.png", show_shapes=True)
You must install pydot (`pip install pydot`) and install graphviz (see instructions at https://graphviz.gitlab.io/download/) for plot_model to work.
In [26]:
model_01.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
              metrics=['accuracy'])
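The sparse variant of categorical cross-entropy takes integer class ids rather than one-hot vectors, and with `from_logits=False` it expects the softmax probabilities the model already produces. The NumPy sketch below computes the same quantity by hand on made-up probabilities. The function is a simplified illustration and the `eps` clipping value is an assumption, not a statement about the Keras internals.

```python
import numpy as np


def sparse_categorical_crossentropy(y_true, probs, eps=1e-7):
    """Mean of -log(probability assigned to the true class)."""
    y_true = np.asarray(y_true).ravel()
    probs = np.clip(np.asarray(probs, dtype=float), eps, 1.0)
    picked = probs[np.arange(len(y_true)), y_true]
    return float(-np.log(picked).mean())


# Two made-up examples over 3 classes; labels are integer ids, not one-hot.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.1, 0.8]])
labels = np.array([[0], [2]])
print(sparse_categorical_crossentropy(labels, probs))

# A uniform guess over 10 classes gives ln(10), a useful baseline
# when reading CIFAR-10 loss values.
uniform = np.full((1, 10), 0.1)
print(sparse_categorical_crossentropy([3], uniform))  # about 2.3026
```

A loss near 2.30 on CIFAR-10 therefore means the model is doing no better than guessing all ten classes with equal probability.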

Model Train¶

In [27]:
# Start time
start_time = time.time()

history_01 = model_01.fit(x_train_norm
                    ,y_train_split
                    ,epochs=200
                    ,batch_size=64
                    ,verbose=1
                    ,validation_data=(x_valid_norm, y_valid_split)
                    ,callbacks=[
                     tf.keras.callbacks.ModelCheckpoint("A2_Exp_01_2DNN.h5",save_best_only=True,save_weights_only=False)
                     ,tf.keras.callbacks.EarlyStopping(monitor='val_accuracy', patience=7),
                    ]
                   )

# End time
end_time = time.time()

# Calculate and print the time taken
elapsed_time = end_time - start_time
print(f"Time taken to train Model: {elapsed_time:.2f} seconds")
results["Experiment1"]["Train Time"] = str(round(elapsed_time, 2)) + ' seconds'
Epoch 1/200
704/704 [==============================] - 3s 2ms/step - loss: 1.8620 - accuracy: 0.3257 - val_loss: 1.7651 - val_accuracy: 0.3610
Epoch 2/200
704/704 [==============================] - 1s 2ms/step - loss: 1.6743 - accuracy: 0.3978 - val_loss: 1.6542 - val_accuracy: 0.4146
Epoch 3/200
704/704 [==============================] - 1s 2ms/step - loss: 1.5933 - accuracy: 0.4304 - val_loss: 1.6508 - val_accuracy: 0.4144
Epoch 4/200
704/704 [==============================] - 1s 2ms/step - loss: 1.5277 - accuracy: 0.4515 - val_loss: 1.5871 - val_accuracy: 0.4398
Epoch 5/200
704/704 [==============================] - 1s 2ms/step - loss: 1.4943 - accuracy: 0.4641 - val_loss: 1.5559 - val_accuracy: 0.4500
Epoch 6/200
704/704 [==============================] - 1s 2ms/step - loss: 1.4663 - accuracy: 0.4741 - val_loss: 1.5530 - val_accuracy: 0.4480
Epoch 7/200
704/704 [==============================] - 1s 2ms/step - loss: 1.4398 - accuracy: 0.4827 - val_loss: 1.5217 - val_accuracy: 0.4706
Epoch 8/200
704/704 [==============================] - 1s 2ms/step - loss: 1.4171 - accuracy: 0.4948 - val_loss: 1.5630 - val_accuracy: 0.4478
Epoch 9/200
704/704 [==============================] - 1s 2ms/step - loss: 1.3965 - accuracy: 0.4995 - val_loss: 1.5715 - val_accuracy: 0.4530
Epoch 10/200
704/704 [==============================] - 1s 2ms/step - loss: 1.3756 - accuracy: 0.5081 - val_loss: 1.5471 - val_accuracy: 0.4564
Epoch 11/200
704/704 [==============================] - 1s 2ms/step - loss: 1.3568 - accuracy: 0.5138 - val_loss: 1.5371 - val_accuracy: 0.4624
Epoch 12/200
704/704 [==============================] - 1s 2ms/step - loss: 1.3382 - accuracy: 0.5185 - val_loss: 1.5280 - val_accuracy: 0.4652
Epoch 13/200
704/704 [==============================] - 1s 2ms/step - loss: 1.3255 - accuracy: 0.5238 - val_loss: 1.5387 - val_accuracy: 0.4616
Epoch 14/200
704/704 [==============================] - 1s 2ms/step - loss: 1.3111 - accuracy: 0.5298 - val_loss: 1.5503 - val_accuracy: 0.4612
Time taken to train Model: 21.78 seconds
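Training stopped after 14 of the 200 allowed epochs because of the `EarlyStopping` callback with `patience=7` on `val_accuracy`. The sketch below replays the logged validation accuracies through a simplified patience rule to show why epoch 14 is the stopping point. `early_stop_epoch` is a hypothetical helper that assumes a metric to maximise and no minimum improvement threshold; it is a model of the behaviour, not the Keras implementation.

```python
def early_stop_epoch(values, patience):
    """Return the 1-based epoch at which training stops, or None.

    Simplified model of patience-based stopping on a metric to maximise:
    stop once `patience` consecutive epochs fail to beat the best value.
    """
    best = float("-inf")
    wait = 0
    for epoch, value in enumerate(values, start=1):
        if value > best:
            best, wait = value, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


# val_accuracy per epoch, copied from the training log above.
val_accuracy = [0.3610, 0.4146, 0.4144, 0.4398, 0.4500, 0.4480, 0.4706,
                0.4478, 0.4530, 0.4564, 0.4624, 0.4652, 0.4616, 0.4612]

print(early_stop_epoch(val_accuracy, patience=7))  # 14
print(val_accuracy.index(max(val_accuracy)) + 1)    # 7, the best epoch
```

The 704 steps per epoch in the log follow from the data size: 45000 training images in batches of 64 give 703 full batches plus one partial batch.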
In [28]:
train_loss = history_01.history['loss'][-1]  # Training loss from the last epoch
train_accuracy = history_01.history['accuracy'][-1]  # Training accuracy from the last epoch
val_loss = history_01.history['val_loss'][-1]  # Validation loss from the last epoch
val_accuracy = history_01.history['val_accuracy'][-1]  # Validation accuracy from the last epoch

# Print training and validation metrics
print(f"Training Loss: {train_loss:.3f}, Training Accuracy: {train_accuracy:.3f}")
print(f"Validation Loss: {val_loss:.3f}, Validation Accuracy: {val_accuracy:.3f}")

model_01 = tf.keras.models.load_model("A2_Exp_01_2DNN.h5")
test_loss, test_accuracy = model_01.evaluate(x_test_norm, y_test, verbose=0)
print(f"Test Loss: {test_loss:.3f}, Test Accuracy: {test_accuracy:.3f}")

results["Experiment1"]["Test Accuracy"] = round(test_accuracy,3)
results["Experiment1"]["Test Loss"] = round(test_loss,3)
results["Experiment1"]["Train Accuracy"] = round(train_accuracy,3)
results["Experiment1"]["Train Loss"] = round(train_loss,3)
results["Experiment1"]["Validation Accuracy"] = round(val_accuracy,3)
results["Experiment1"]["Validation Loss"] = round(val_loss,3)
Training Loss: 1.311, Training Accuracy: 0.530
Validation Loss: 1.550, Validation Accuracy: 0.461
Test Loss: 1.478, Test Accuracy: 0.474
In [29]:
pred01 = model_01.predict(x_test_norm)
print('shape of preds: ', pred01.shape)

history_01_dict = history_01.history
history_01_dict.keys()
313/313 [==============================] - 1s 911us/step
shape of preds:  (10000, 10)
Out[29]:
dict_keys(['loss', 'accuracy', 'val_loss', 'val_accuracy'])
In [30]:
history__01_df=pd.DataFrame(history_01_dict)
history__01_df.tail().round(3)
Out[30]:
loss accuracy val_loss val_accuracy
9 1.376 0.508 1.547 0.456
10 1.357 0.514 1.537 0.462
11 1.338 0.518 1.528 0.465
12 1.325 0.524 1.539 0.462
13 1.311 0.530 1.550 0.461

Plotting Performance Metrics¶

We use Matplotlib to create 2 plots, stacked vertically, displaying the training and validation accuracy (top) and loss (bottom) for each training epoch.

In [31]:
plt.subplots(figsize=(16,12))
plt.tight_layout()
display_training_curves(history_01.history['accuracy'], history_01.history['val_accuracy'], 'accuracy', 211)
display_training_curves(history_01.history['loss'], history_01.history['val_loss'], 'loss', 212)
[Figure: training and validation accuracy (top) and loss (bottom) per epoch for Experiment 1]

Confusion matrices¶

Using sklearn.metrics, we print a classification report. Then we visualize the confusion matrix and see what that tells us.

In [32]:
pred01_cm=np.argmax(pred01, axis=1)
print_validation_report(y_test, pred01_cm)
Classification Report
              precision    recall  f1-score   support

           0       0.56      0.47      0.51      1000
           1       0.64      0.51      0.57      1000
           2       0.38      0.27      0.32      1000
           3       0.34      0.30      0.32      1000
           4       0.42      0.34      0.38      1000
           5       0.40      0.33      0.36      1000
           6       0.44      0.65      0.53      1000
           7       0.49      0.57      0.53      1000
           8       0.52      0.73      0.61      1000
           9       0.52      0.57      0.54      1000

    accuracy                           0.47     10000
   macro avg       0.47      0.47      0.47     10000
weighted avg       0.47      0.47      0.47     10000

Accuracy Score: 0.4743
Root Mean Square Error: 3.190235101054466
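The root mean square error above is computed directly on the integer class ids, so it should be read with care: CIFAR-10 classes have no natural order, and the value depends on how the classes happen to be numbered. The small sketch below, with made-up predictions and a hypothetical `label_rmse` helper, shows two prediction sets with identical accuracy but very different RMSE.

```python
import math


def label_rmse(y_true, y_pred):
    """RMSE computed directly on integer class ids."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


# Each prediction list contains exactly one mistake out of four examples.
y_true = [0, 0, 0, 0]
airplane_as_automobile = [1, 0, 0, 0]   # class 0 predicted as class 1
airplane_as_truck = [9, 0, 0, 0]        # class 0 predicted as class 9

print(label_rmse(y_true, airplane_as_automobile))  # 0.5
print(label_rmse(y_true, airplane_as_truck))       # 4.5
```

Both cases are 75% accurate, so accuracy and the per-class precision and recall in the classification report are the more informative numbers for this task.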
In [33]:
plot_confusion_matrix(y_test,pred01_cm)
[Figure: confusion-matrix heatmap for Experiment 1 on the test set]
In [34]:
cm = sns.light_palette((260, 75, 60), input="husl", as_cmap=True)
df = pd.DataFrame(pred01[0:20], columns = ['airplane'
                                          ,'automobile'
                                          ,'bird'
                                          ,'cat'
                                          ,'deer'
                                          ,'dog'
                                          ,'frog'
                                          ,'horse'
                                          ,'ship'
                                          ,'truck'])
df.style.format("{:.2%}").background_gradient(cmap=cm)
Out[34]:
  airplane automobile bird cat deer dog frog horse ship truck
0 2.77% 27.15% 4.39% 34.81% 3.44% 12.96% 1.63% 1.93% 7.93% 2.99%
1 12.46% 10.69% 0.89% 0.03% 0.16% 0.03% 0.06% 0.17% 55.51% 20.00%
2 22.51% 4.83% 0.11% 0.07% 0.06% 0.06% 0.01% 0.32% 63.13% 8.92%
3 47.36% 5.11% 1.19% 0.26% 1.15% 0.60% 0.04% 4.09% 37.21% 2.98%
4 0.14% 0.32% 12.79% 2.22% 48.58% 1.60% 33.94% 0.22% 0.15% 0.04%
5 0.64% 1.33% 10.41% 27.64% 3.89% 4.59% 45.16% 5.36% 0.17% 0.83%
6 29.74% 30.35% 2.03% 19.15% 0.52% 11.03% 1.32% 4.34% 1.28% 0.25%
7 0.53% 0.44% 33.00% 7.02% 10.17% 2.22% 45.86% 0.30% 0.09% 0.36%
8 8.12% 1.57% 28.15% 7.73% 27.35% 8.56% 3.97% 13.19% 0.99% 0.37%
9 0.93% 66.77% 1.43% 1.65% 0.35% 0.55% 0.01% 0.20% 5.73% 22.37%
10 24.96% 0.38% 5.37% 7.07% 4.28% 11.15% 4.48% 0.40% 41.66% 0.25%
11 0.12% 24.82% 0.14% 0.65% 0.07% 0.25% 0.14% 0.11% 13.47% 60.23%
12 4.03% 11.27% 13.19% 14.34% 15.02% 12.00% 24.42% 3.10% 1.46% 1.18%
13 17.55% 0.61% 1.98% 0.12% 0.24% 1.83% 0.12% 76.64% 0.42% 0.49%
14 4.44% 53.18% 5.58% 0.79% 0.18% 1.24% 0.08% 1.86% 0.67% 31.98%
15 12.92% 1.14% 1.75% 3.11% 1.72% 13.57% 0.94% 0.92% 60.53% 3.40%
16 0.19% 7.07% 1.14% 44.20% 0.08% 17.42% 0.17% 20.49% 0.92% 8.32%
17 23.75% 1.56% 10.05% 4.80% 25.29% 3.16% 3.69% 9.88% 10.61% 7.21%
18 3.22% 2.97% 0.08% 0.03% 0.25% 0.01% 0.01% 0.18% 92.46% 0.79%
19 0.24% 2.92% 6.11% 3.29% 1.78% 7.92% 65.50% 9.72% 0.09% 2.43%
In [35]:
# Collect the outputs of the first 8 layers (the model has only 4, so this returns all of them):
layer_outputs = [layer.output for layer in model_01.layers[:8]]
# Create a model that returns these outputs, given the model input:
activation_model_01 = tf.keras.models.Model(inputs=model_01.input, outputs=layer_outputs)

# Get activation values: index -3 is the first hidden dense layer (384 units),
# index -1 is the softmax output layer
activations_01 = activation_model_01.predict(x_valid_norm[:2000])
dense_layer_activations_01 = activations_01[-3]
output_layer_activations_01 = activations_01[-1]
63/63 [==============================] - 0s 873us/step
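Which layer each index selects is easy to check with plain Python lists, since the list of layer outputs is sliced and indexed the same way. The sketch below uses the layer names from `model_01.summary()`; the list itself is written out by hand for illustration.

```python
# Layer names of model_01, in order, as listed by model_01.summary().
layer_names = ["flatten", "dense", "dense_1", "dense_2"]

selected = layer_names[:8]   # slicing past the end is allowed: all 4 layers
print(len(selected))         # 4
print(selected[-3])          # dense   (first hidden layer, 384 units)
print(selected[-2])          # dense_1 (second hidden layer, 768 units)
print(selected[-1])          # dense_2 (softmax output layer, 10 units)
```

To visualize the last hidden layer rather than the first, index `-2` would be the one to use.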

sklearn.manifold.TSNE¶

https://scikit-learn.org/stable/modules/generated/sklearn.manifold.TSNE.html

In [36]:
# Reduce the dimension using t-SNE to visualize in a scatterplot
tsne_01 = TSNE(n_components=2, verbose=1, init='pca', learning_rate='auto', perplexity=40, n_iter=300)
tsne_results_01 = tsne_01.fit_transform(dense_layer_activations_01)

# Scaling
tsne_results_01 = (tsne_results_01 - tsne_results_01.min()) / (tsne_results_01.max() - tsne_results_01.min())
[t-SNE] Computing 121 nearest neighbors...
[t-SNE] Indexed 2000 samples in 0.000s...
[t-SNE] Computed neighbors for 2000 samples in 0.109s...
[t-SNE] Computed conditional probabilities for sample 1000 / 2000
[t-SNE] Computed conditional probabilities for sample 2000 / 2000
[t-SNE] Mean sigma: 4.205060
[t-SNE] KL divergence after 250 iterations with early exaggeration: 73.325455
[t-SNE] KL divergence after 300 iterations: 2.298152
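The scaling step above uses a single minimum and maximum for the whole embedding, not one per axis. The NumPy sketch below contrasts the two choices on a made-up 2-D embedding; the variable names are illustrative only.

```python
import numpy as np

# Made-up 2-D embedding standing in for the t-SNE output.
embedding = np.array([[-10.0, 2.0],
                      [0.0, -4.0],
                      [30.0, 6.0]])

# Global scaling (as in the cell above): one min and one max for both axes.
global_scaled = (embedding - embedding.min()) / (embedding.max() - embedding.min())

# Per-axis alternative: each column is stretched to [0, 1] independently.
col_min = embedding.min(axis=0)
col_max = embedding.max(axis=0)
per_axis_scaled = (embedding - col_min) / (col_max - col_min)

print(global_scaled)
print(per_axis_scaled)
```

Global scaling shrinks both axes by the same factor, so relative distances between points are preserved, at the cost of one axis possibly not filling the full 0 to 1 range. Per-axis scaling fills the unit square but distorts distances.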
In [37]:
cmap = plt.cm.tab10
plt.figure(figsize=(16,10))
# scatter = plt.scatter(tsne_results_01[:,0],tsne_results_01[:,1], c=y_valid_split[:3250], s=10, cmap=cmap)
scatter = plt.scatter(tsne_results_01[:,0],tsne_results_01[:,1], c=y_valid_split[:2000], s=10, cmap=cmap)
plt.legend(handles=scatter.legend_elements()[0], labels=class_names)

image_positions = np.array([[1., 1.]])
for index, position in enumerate(tsne_results_01):
    dist = np.sum((position - image_positions) ** 2, axis=1)
    if np.min(dist) > 0.02: # if far enough from other images
        image_positions = np.r_[image_positions, [position]]
        imagebox = mpl.offsetbox.AnnotationBbox(
            mpl.offsetbox.OffsetImage(x_valid_split[index]),  # thumbnails must come from the validation images that produced the activations
            position, bboxprops={"lw": 1})
        plt.gca().add_artist(imagebox)
plt.axis("off")
plt.show()
[Figure: t-SNE scatter plot of hidden-layer activations for Experiment 1, colored by class, with sample image thumbnails]

Experiment 2¶

  • DNN with 3 layers
  • no regularization
Build DNN Model¶
In [38]:
k.clear_session()
model_02 = Sequential([
  Flatten(input_shape=x_train_norm.shape[1:]),
  Dense(units=384,activation=tf.nn.relu),
  Dense(units=768,activation=tf.nn.relu),
  Dense(units=1536,activation=tf.nn.relu),
  Dense(units=10, activation=tf.nn.softmax)
])
results["Experiment2"] = {}
results["Experiment2"]["Architecture"] = "• DNN with 3 layers\n • no regularization"
In [39]:
model_02.summary()
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 flatten (Flatten)           (None, 3072)              0         
                                                                 
 dense (Dense)               (None, 384)               1180032   
                                                                 
 dense_1 (Dense)             (None, 768)               295680    
                                                                 
 dense_2 (Dense)             (None, 1536)              1181184   
                                                                 
 dense_3 (Dense)             (None, 10)                15370     
                                                                 
=================================================================
Total params: 2672266 (10.19 MB)
Trainable params: 2672266 (10.19 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
In [40]:
keras.utils.plot_model(model_02, "CIFAR10_EXP_02.png", show_shapes=True)
You must install pydot (`pip install pydot`) and install graphviz (see instructions at https://graphviz.gitlab.io/download/) for plot_model to work.
In [41]:
model_02.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
              metrics=['accuracy'])

Model Train¶

In [42]:
# Start time
start_time = time.time()
history_02 = model_02.fit(x_train_norm
                    ,y_train_split
                    ,epochs=200
                    ,batch_size=64
                    ,verbose=1
                    ,validation_data=(x_valid_norm, y_valid_split)
                    ,callbacks=[
                     tf.keras.callbacks.ModelCheckpoint("A2_Exp_02_3DNN.h5",save_best_only=True,save_weights_only=False)
                     ,tf.keras.callbacks.EarlyStopping(monitor='val_accuracy', patience=10),
                    ]
                   )

# End time
end_time = time.time()

# Calculate and print the time taken
elapsed_time = end_time - start_time
print(f"Time taken to train Model: {elapsed_time:.2f} seconds")
results["Experiment2"]["Train Time"] = str(round(elapsed_time, 2)) + ' seconds'
Epoch 1/200
704/704 [==============================] - 3s 2ms/step - loss: 1.8682 - accuracy: 0.3207 - val_loss: 1.8131 - val_accuracy: 0.3428
Epoch 2/200
704/704 [==============================] - 2s 2ms/step - loss: 1.6875 - accuracy: 0.3922 - val_loss: 1.6631 - val_accuracy: 0.4002
Epoch 3/200
704/704 [==============================] - 2s 2ms/step - loss: 1.6053 - accuracy: 0.4210 - val_loss: 1.6479 - val_accuracy: 0.4274
Epoch 4/200
704/704 [==============================] - 2s 2ms/step - loss: 1.5433 - accuracy: 0.4454 - val_loss: 1.6016 - val_accuracy: 0.4276
Epoch 5/200
704/704 [==============================] - 2s 2ms/step - loss: 1.5096 - accuracy: 0.4591 - val_loss: 1.5565 - val_accuracy: 0.4464
Epoch 6/200
704/704 [==============================] - 2s 2ms/step - loss: 1.4642 - accuracy: 0.4721 - val_loss: 1.5056 - val_accuracy: 0.4610
Epoch 7/200
704/704 [==============================] - 1s 2ms/step - loss: 1.4304 - accuracy: 0.4871 - val_loss: 1.5461 - val_accuracy: 0.4504
Epoch 8/200
704/704 [==============================] - 1s 2ms/step - loss: 1.3962 - accuracy: 0.4967 - val_loss: 1.5182 - val_accuracy: 0.4662
Epoch 9/200
704/704 [==============================] - 1s 2ms/step - loss: 1.3642 - accuracy: 0.5113 - val_loss: 1.5384 - val_accuracy: 0.4670
Epoch 10/200
704/704 [==============================] - 1s 2ms/step - loss: 1.3334 - accuracy: 0.5178 - val_loss: 1.5186 - val_accuracy: 0.4674
Epoch 11/200
704/704 [==============================] - 1s 2ms/step - loss: 1.2867 - accuracy: 0.5344 - val_loss: 1.5450 - val_accuracy: 0.4644
Epoch 12/200
704/704 [==============================] - 1s 2ms/step - loss: 1.2536 - accuracy: 0.5486 - val_loss: 1.5577 - val_accuracy: 0.4644
Epoch 13/200
704/704 [==============================] - 1s 2ms/step - loss: 1.2116 - accuracy: 0.5608 - val_loss: 1.5392 - val_accuracy: 0.4816
Epoch 14/200
704/704 [==============================] - 1s 2ms/step - loss: 1.1612 - accuracy: 0.5789 - val_loss: 1.6585 - val_accuracy: 0.4562
Epoch 15/200
704/704 [==============================] - 1s 2ms/step - loss: 1.1117 - accuracy: 0.5966 - val_loss: 1.6516 - val_accuracy: 0.4562
Epoch 16/200
704/704 [==============================] - 1s 2ms/step - loss: 1.0550 - accuracy: 0.6170 - val_loss: 1.7009 - val_accuracy: 0.4562
Epoch 17/200
704/704 [==============================] - 1s 2ms/step - loss: 1.0008 - accuracy: 0.6342 - val_loss: 1.7092 - val_accuracy: 0.4636
Epoch 18/200
704/704 [==============================] - 1s 2ms/step - loss: 0.9376 - accuracy: 0.6580 - val_loss: 1.7983 - val_accuracy: 0.4670
Epoch 19/200
704/704 [==============================] - 1s 2ms/step - loss: 0.8947 - accuracy: 0.6726 - val_loss: 1.8620 - val_accuracy: 0.4704
Epoch 20/200
704/704 [==============================] - 1s 2ms/step - loss: 0.8237 - accuracy: 0.6989 - val_loss: 1.9702 - val_accuracy: 0.4608
Epoch 21/200
704/704 [==============================] - 1s 2ms/step - loss: 0.7661 - accuracy: 0.7220 - val_loss: 2.0270 - val_accuracy: 0.4550
Epoch 22/200
704/704 [==============================] - 1s 2ms/step - loss: 0.7078 - accuracy: 0.7425 - val_loss: 2.1814 - val_accuracy: 0.4558
Epoch 23/200
704/704 [==============================] - 1s 2ms/step - loss: 0.6547 - accuracy: 0.7624 - val_loss: 2.3066 - val_accuracy: 0.4538
Time taken to train Model: 36.06 seconds
In [43]:
train_loss = history_02.history['loss'][-1]  # Training loss from the last epoch
train_accuracy = history_02.history['accuracy'][-1]  # Training accuracy from the last epoch
val_loss = history_02.history['val_loss'][-1]  # Validation loss from the last epoch
val_accuracy = history_02.history['val_accuracy'][-1]  # Validation accuracy from the last epoch

# Print training and validation metrics
print(f"Training Loss: {train_loss:.3f}, Training Accuracy: {train_accuracy:.3f}")
print(f"Validation Loss: {val_loss:.3f}, Validation Accuracy: {val_accuracy:.3f}")

model_02 = tf.keras.models.load_model("A2_Exp_02_3DNN.h5")
test_loss, test_accuracy = model_02.evaluate(x_test_norm, y_test, verbose=0)
print(f"Test Loss: {test_loss:.3f}, Test Accuracy: {test_accuracy:.3f}")

results["Experiment2"]["Test Accuracy"] = round(test_accuracy,3)
results["Experiment2"]["Test Loss"] = round(test_loss,3)
results["Experiment2"]["Train Accuracy"] = round(train_accuracy,3)
results["Experiment2"]["Train Loss"] = round(train_loss,3)
results["Experiment2"]["Validation Accuracy"] = round(val_accuracy,3)
results["Experiment2"]["Validation Loss"] = round(val_loss,3)
Training Loss: 0.655, Training Accuracy: 0.762
Validation Loss: 2.307, Validation Accuracy: 0.454
Test Loss: 1.466, Test Accuracy: 0.474
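The test loss (1.466) is far below the reported validation loss (2.307) because the two numbers come from different epochs. The training and validation figures are read from the last epoch of the history, while the test figures come from the reloaded checkpoint. With `save_best_only=True` and no explicit `monitor`, `ModelCheckpoint` keeps the epoch with the lowest `val_loss`. The sketch below finds that epoch from the logged values; it is a plain-Python illustration, not part of the original notebook.

```python
# val_loss per epoch for Experiment 2, copied from the training log above.
val_loss = [1.8131, 1.6631, 1.6479, 1.6016, 1.5565, 1.5056, 1.5461, 1.5182,
            1.5384, 1.5186, 1.5450, 1.5577, 1.5392, 1.6585, 1.6516, 1.7009,
            1.7092, 1.7983, 1.8620, 1.9702, 2.0270, 2.1814, 2.3066]

best_index = min(range(len(val_loss)), key=val_loss.__getitem__)
print("best epoch:", best_index + 1, "val_loss:", val_loss[best_index])
print("last epoch:", len(val_loss), "val_loss:", val_loss[-1])
```

One way to make the rows of the results table comparable would be to index each history list with `best_index` instead of `-1`, so that training, validation, and test figures all describe the same saved model.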
In [44]:
pred02 = model_02.predict(x_test_norm)
print('shape of preds: ', pred02.shape)

history_02_dict = history_02.history
history_02_df=pd.DataFrame(history_02_dict)
history_02_df.tail().round(3)
313/313 [==============================] - 0s 896us/step
shape of preds:  (10000, 10)
Out[44]:
loss accuracy val_loss val_accuracy
18 0.895 0.673 1.862 0.470
19 0.824 0.699 1.970 0.461
20 0.766 0.722 2.027 0.455
21 0.708 0.743 2.181 0.456
22 0.655 0.762 2.307 0.454

Plotting Performance Metrics

We use Matplotlib to create two stacked plots: training and validation accuracy per epoch on top, and training and validation loss per epoch below.

In [45]:
plt.subplots(figsize=(16,12))
plt.tight_layout()
display_training_curves(history_02.history['accuracy'], history_02.history['val_accuracy'], 'accuracy', 211)
display_training_curves(history_02.history['loss'], history_02.history['val_loss'], 'loss', 212)
No description has been provided for this image

Confusion matrices

We use sklearn.metrics to print a classification report, then visualize the confusion matrix and see what that tells us.

In [46]:
pred02_cm=np.argmax(pred02, axis=1)
print_validation_report(y_test, pred02_cm)
Classification Report
              precision    recall  f1-score   support

           0       0.56      0.51      0.53      1000
           1       0.64      0.52      0.57      1000
           2       0.39      0.17      0.24      1000
           3       0.34      0.38      0.36      1000
           4       0.37      0.44      0.40      1000
           5       0.43      0.28      0.34      1000
           6       0.46      0.62      0.53      1000
           7       0.51      0.56      0.53      1000
           8       0.52      0.69      0.60      1000
           9       0.51      0.57      0.54      1000

    accuracy                           0.47     10000
   macro avg       0.47      0.47      0.46     10000
weighted avg       0.47      0.47      0.46     10000

Accuracy Score: 0.4742
Root Mean Square Error: 3.16234090508914
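
The per-class precision and recall in the report above follow directly from the confusion matrix. Below is a minimal sketch with NumPy on toy labels. The six labels are made up for illustration and are not taken from CIFAR-10, and `confusion_counts` is a hypothetical helper, not part of this notebook.

```python
import numpy as np

def confusion_counts(y_true, y_pred, n_classes):
    """Return an (n_classes, n_classes) matrix: rows are true labels, columns are predictions."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (np.asarray(y_true), np.asarray(y_pred)), 1)
    return cm

# Toy labels for illustration only (not CIFAR-10 data)
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

cm = confusion_counts(y_true, y_pred, n_classes=3)
recall = np.diag(cm) / cm.sum(axis=1)     # correct / all samples of the true class
precision = np.diag(cm) / cm.sum(axis=0)  # correct / all samples predicted as the class
accuracy = np.trace(cm) / cm.sum()
print(cm, recall, precision, accuracy)
```

Read this way, class 2 (bird) in the report has a recall of 0.17: only 17% of the 1,000 bird images were labeled as bird. Note also that the Root Mean Square Error printed with the report treats class indices as numbers, so its value depends on the arbitrary ordering of the labels.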
In [47]:
plot_confusion_matrix(y_test,pred02_cm)
No description has been provided for this image
In [48]:
cm = sns.light_palette((260, 75, 60), input="husl", as_cmap=True)
df = pd.DataFrame(pred02[0:20], columns = ['airplane'
                                          ,'automobile'
                                          ,'bird'
                                          ,'cat'
                                          ,'deer'
                                          ,'dog'
                                          ,'frog'
                                          ,'horse'
                                          ,'ship'
                                          ,'truck'])
df.style.format("{:.2%}").background_gradient(cmap=cm)
Out[48]:
  airplane automobile bird cat deer dog frog horse ship truck
0 3.55% 0.59% 9.31% 45.32% 4.36% 23.38% 8.19% 0.96% 3.53% 0.80%
1 6.87% 14.83% 0.11% 0.11% 0.06% 0.03% 0.01% 0.16% 25.24% 52.59%
2 19.74% 13.84% 0.10% 0.05% 0.07% 0.02% 0.00% 0.18% 47.81% 18.18%
3 20.62% 5.98% 1.12% 0.57% 1.31% 0.25% 0.04% 1.52% 59.66% 8.93%
4 0.22% 0.08% 5.03% 3.49% 47.37% 2.18% 40.56% 0.78% 0.20% 0.09%
5 2.10% 1.30% 7.12% 22.54% 18.84% 7.46% 36.99% 1.84% 0.62% 1.20%
6 5.30% 3.61% 4.39% 59.53% 0.10% 19.62% 4.91% 1.38% 0.78% 0.39%
7 1.00% 1.29% 20.43% 18.71% 15.40% 13.58% 25.20% 1.65% 0.97% 1.78%
8 4.20% 0.66% 18.22% 13.27% 33.68% 12.31% 1.87% 9.95% 4.71% 1.12%
9 2.89% 59.93% 1.21% 3.97% 0.32% 1.14% 0.61% 0.16% 8.96% 20.81%
10 20.15% 0.14% 4.32% 11.88% 0.82% 16.82% 4.85% 1.26% 39.25% 0.51%
11 0.51% 13.50% 0.07% 0.17% 0.02% 0.05% 0.02% 0.04% 1.17% 84.46%
12 2.40% 4.37% 7.44% 26.80% 7.52% 22.26% 12.52% 11.25% 2.59% 2.83%
13 13.77% 8.77% 1.72% 0.64% 1.11% 5.85% 0.54% 63.98% 2.24% 1.39%
14 4.14% 41.69% 4.13% 1.01% 0.08% 2.09% 0.18% 1.43% 1.63% 43.60%
15 7.46% 1.46% 3.54% 6.94% 4.71% 8.41% 5.57% 2.25% 55.42% 4.24%
16 0.54% 0.10% 4.07% 38.52% 2.34% 42.45% 3.16% 6.47% 1.70% 0.65%
17 10.43% 8.59% 5.40% 11.43% 20.51% 8.51% 6.56% 10.06% 9.17% 9.35%
18 3.20% 1.07% 0.01% 0.01% 0.06% 0.00% 0.00% 0.01% 94.48% 1.15%
19 0.11% 0.08% 3.00% 3.09% 20.33% 3.37% 62.85% 6.95% 0.02% 0.20%
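
One way to read a row of this table is to compare its two largest probabilities. The sketch below uses rows 14 and 18 copied from the output above; `top_two` is a hypothetical helper introduced only for this illustration.

```python
import numpy as np

class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck']

# Rows 14 and 18 of the table above, converted from percentages to fractions
probs = np.array([
    [0.0414, 0.4169, 0.0413, 0.0101, 0.0008, 0.0209, 0.0018, 0.0143, 0.0163, 0.4360],
    [0.0320, 0.0107, 0.0001, 0.0001, 0.0006, 0.0000, 0.0000, 0.0001, 0.9448, 0.0115],
])

def top_two(row):
    """Return (best class, runner-up class, probability margin between them)."""
    order = np.argsort(row)[::-1]  # indices sorted by descending probability
    best, second = order[0], order[1]
    return class_names[best], class_names[second], float(row[best] - row[second])

for row in probs:
    print(top_two(row))
```

Row 14 is a near tie between truck and automobile (a margin of about 0.02), while row 18 is a confident ship prediction.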
In [49]:
# Extracts the outputs of the top 8 layers:
layer_outputs = [layer.output for layer in model_02.layers[:8]]
# Creates a model that will return these outputs, given the model input:
activation_model_02 = tf.keras.models.Model(inputs=model_02.input, outputs=layer_outputs)

# Get activation values for the last dense layer
# activations_02 = activation_model_02.predict(x_valid_norm[:3250])
activations_02 = activation_model_02.predict(x_valid_norm[:2000])
dense_layer_activations_02 = activations_02[-3]
output_layer_activations_02 = activations_02[-1]
63/63 [==============================] - 0s 917us/step

sklearn.manifold.TSNE

https://scikit-learn.org/stable/modules/generated/sklearn.manifold.TSNE.html

In [50]:
# Reduce the dimension using t-SNE to visualize in a scatterplot
tsne_02 = TSNE(n_components=2, verbose=1, init='pca', learning_rate='auto', perplexity=40, n_iter=300)
tsne_results_02 = tsne_02.fit_transform(dense_layer_activations_02)

# Scaling
tsne_results_02 = (tsne_results_02 - tsne_results_02.min()) / (tsne_results_02.max() - tsne_results_02.min())
[t-SNE] Computing 121 nearest neighbors...
[t-SNE] Indexed 2000 samples in 0.001s...
[t-SNE] Computed neighbors for 2000 samples in 0.062s...
[t-SNE] Computed conditional probabilities for sample 1000 / 2000
[t-SNE] Computed conditional probabilities for sample 2000 / 2000
[t-SNE] Mean sigma: 1.543442
[t-SNE] KL divergence after 250 iterations with early exaggeration: 71.850433
[t-SNE] KL divergence after 300 iterations: 2.353606
In [51]:
cmap = plt.cm.tab10
plt.figure(figsize=(16,10))
# scatter = plt.scatter(tsne_results_02[:,0],tsne_results_02[:,1], c=y_valid_split[:3250], s=10, cmap=cmap)
scatter = plt.scatter(tsne_results_02[:,0],tsne_results_02[:,1], c=y_valid_split[:2000], s=10, cmap=cmap)
plt.legend(handles=scatter.legend_elements()[0], labels=class_names)

image_positions = np.array([[1., 1.]])
for index, position in enumerate(tsne_results_02):
    dist = np.sum((position - image_positions) ** 2, axis=1)
    if np.min(dist) > 0.02: # if far enough from other images
        image_positions = np.r_[image_positions, [position]]
        imagebox = mpl.offsetbox.AnnotationBbox(
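            # Thumbnails come from the validation images that were embedded (assumes x_valid_norm is scaled to [0, 1])
            mpl.offsetbox.OffsetImage(x_valid_norm[index], cmap="binary"),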
            position, bboxprops={"lw": 1})
        plt.gca().add_artist(imagebox)
plt.axis("off")
plt.show()
No description has been provided for this image
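
The loop in the cell above places a thumbnail only when its point is far enough from every thumbnail already placed. The same greedy rule in isolation, as a sketch: `select_spaced_points` is a hypothetical name, and the four points are made up for illustration.

```python
import numpy as np

def select_spaced_points(points, min_sq_dist=0.02):
    """Greedily keep indices of points whose squared distance to all kept points exceeds min_sq_dist."""
    kept = np.array([[1., 1.]])  # same sentinel start position as in the cell above
    chosen = []
    for index, position in enumerate(points):
        sq_dist = np.sum((position - kept) ** 2, axis=1)
        if np.min(sq_dist) > min_sq_dist:
            kept = np.r_[kept, [position]]
            chosen.append(index)
    return chosen

# Made-up 2-D points already scaled to [0, 1], as the t-SNE results are
points = np.array([[0.0, 0.0], [0.05, 0.05], [0.5, 0.5], [0.99, 0.99]])
chosen = select_spaced_points(points)
print(chosen)
```

Because the kept list starts with the sentinel at (1, 1), points very close to that corner never receive a thumbnail.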

Experiment 3

  • CNN with 2 convolution/max pooling layers
  • 1 fully-connected layer
  • no regularization
Build CNN Model
In [52]:
k.clear_session()
model_03 = Sequential([
  Conv2D(filters=128, kernel_size=(3, 3), strides=(1, 1), activation=tf.nn.relu,input_shape=x_train_norm.shape[1:]),
  MaxPool2D((2, 2),strides=2),
  Conv2D(filters=256, kernel_size=(3, 3), strides=(1, 1), activation=tf.nn.relu),
  MaxPool2D((2, 2),strides=2),
  Flatten(),
  Dense(units=384,activation=tf.nn.relu),
  Dense(units=10, activation=tf.nn.softmax)
])
results["Experiment3"] = {}
results["Experiment3"]["Architecture"] = "• CNN with 2 convolution/max pooling layers\n • 1 fully-connected layer\n • no regularization"
In [53]:
model_03.summary()
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d (Conv2D)             (None, 30, 30, 128)       3584      
                                                                 
 max_pooling2d (MaxPooling2  (None, 15, 15, 128)       0         
 D)                                                              
                                                                 
 conv2d_1 (Conv2D)           (None, 13, 13, 256)       295168    
                                                                 
 max_pooling2d_1 (MaxPoolin  (None, 6, 6, 256)         0         
 g2D)                                                            
                                                                 
 flatten (Flatten)           (None, 9216)              0         
                                                                 
 dense (Dense)               (None, 384)               3539328   
                                                                 
 dense_1 (Dense)             (None, 10)                3850      
                                                                 
=================================================================
Total params: 3841930 (14.66 MB)
Trainable params: 3841930 (14.66 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
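
The parameter counts in the summary can be checked by hand: a Conv2D layer has (kernel height × kernel width × input channels + 1) × filters parameters, and a Dense layer has (inputs + 1) × units, where the + 1 accounts for the bias. A small sketch of that arithmetic (the helper names are hypothetical):

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus one bias per filter."""
    return (kernel_h * kernel_w * in_channels + 1) * filters

def dense_params(inputs, units):
    """Weights plus one bias per unit."""
    return (inputs + 1) * units

params = {
    "conv2d": conv2d_params(3, 3, 3, 128),      # RGB input, 128 filters
    "conv2d_1": conv2d_params(3, 3, 128, 256),  # 128 input channels, 256 filters
    "dense": dense_params(6 * 6 * 256, 384),    # 9216 flattened features
    "dense_1": dense_params(384, 10),           # 10 output classes
}
total = sum(params.values())
print(params, total)
```

More than 90% of the parameters sit in the first dense layer, which sees the 9,216 flattened features.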
In [54]:
keras.utils.plot_model(model_03, "CIFAR10_EXP_03.png", show_shapes=True)
You must install pydot (`pip install pydot`) and install graphviz (see instructions at https://graphviz.gitlab.io/download/) for plot_model to work.
In [55]:
model_03.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
              metrics=['accuracy'])

Model Train

In [56]:
# Start time
start_time = time.time()

history_03 = model_03.fit(x_train_norm
                    ,y_train_split
                    ,epochs=200
                    ,batch_size=64
                    ,verbose=1
                    ,validation_data=(x_valid_norm, y_valid_split)
                    ,callbacks=[
                     tf.keras.callbacks.ModelCheckpoint("A2_Exp_03_2CNN_2DNN.h5",save_best_only=True,save_weights_only=False)
                     ,tf.keras.callbacks.EarlyStopping(monitor='val_accuracy', patience=10),
                    ]
                   )

# End time
end_time = time.time()

# Calculate and print the time taken
elapsed_time = end_time - start_time
print(f"Time taken to train Model: {elapsed_time:.2f} seconds")
results["Experiment3"]["Train Time"] = str(round(elapsed_time, 2)) + ' seconds'
Epoch 1/200
704/704 [==============================] - 5s 5ms/step - loss: 1.4488 - accuracy: 0.4804 - val_loss: 1.3456 - val_accuracy: 0.5360
Epoch 2/200
704/704 [==============================] - 3s 4ms/step - loss: 1.0759 - accuracy: 0.6222 - val_loss: 1.0515 - val_accuracy: 0.6150
Epoch 3/200
704/704 [==============================] - 3s 4ms/step - loss: 0.9241 - accuracy: 0.6772 - val_loss: 1.0442 - val_accuracy: 0.6276
Epoch 4/200
704/704 [==============================] - 3s 4ms/step - loss: 0.8107 - accuracy: 0.7172 - val_loss: 0.9475 - val_accuracy: 0.6708
Epoch 5/200
704/704 [==============================] - 3s 4ms/step - loss: 0.7048 - accuracy: 0.7544 - val_loss: 0.9167 - val_accuracy: 0.6832
Epoch 6/200
704/704 [==============================] - 3s 4ms/step - loss: 0.6097 - accuracy: 0.7870 - val_loss: 0.8989 - val_accuracy: 0.6956
Epoch 7/200
704/704 [==============================] - 3s 4ms/step - loss: 0.5161 - accuracy: 0.8210 - val_loss: 0.9226 - val_accuracy: 0.7056
Epoch 8/200
704/704 [==============================] - 3s 4ms/step - loss: 0.4323 - accuracy: 0.8480 - val_loss: 0.9711 - val_accuracy: 0.6940
Epoch 9/200
704/704 [==============================] - 3s 4ms/step - loss: 0.3497 - accuracy: 0.8782 - val_loss: 0.9961 - val_accuracy: 0.7092
Epoch 10/200
704/704 [==============================] - 3s 4ms/step - loss: 0.2733 - accuracy: 0.9049 - val_loss: 1.1377 - val_accuracy: 0.6926
Epoch 11/200
704/704 [==============================] - 3s 4ms/step - loss: 0.2138 - accuracy: 0.9266 - val_loss: 1.2198 - val_accuracy: 0.6988
Epoch 12/200
704/704 [==============================] - 3s 4ms/step - loss: 0.1752 - accuracy: 0.9385 - val_loss: 1.3672 - val_accuracy: 0.7056
Epoch 13/200
704/704 [==============================] - 3s 4ms/step - loss: 0.1419 - accuracy: 0.9511 - val_loss: 1.5439 - val_accuracy: 0.6946
Epoch 14/200
704/704 [==============================] - 3s 4ms/step - loss: 0.1142 - accuracy: 0.9616 - val_loss: 1.6647 - val_accuracy: 0.6838
Epoch 15/200
704/704 [==============================] - 3s 4ms/step - loss: 0.1076 - accuracy: 0.9620 - val_loss: 1.7791 - val_accuracy: 0.6930
Epoch 16/200
704/704 [==============================] - 3s 4ms/step - loss: 0.0886 - accuracy: 0.9692 - val_loss: 1.7916 - val_accuracy: 0.6912
Epoch 17/200
704/704 [==============================] - 3s 4ms/step - loss: 0.0858 - accuracy: 0.9700 - val_loss: 1.8698 - val_accuracy: 0.6914
Epoch 18/200
704/704 [==============================] - 3s 4ms/step - loss: 0.0740 - accuracy: 0.9744 - val_loss: 1.9822 - val_accuracy: 0.6862
Epoch 19/200
704/704 [==============================] - 3s 4ms/step - loss: 0.0853 - accuracy: 0.9701 - val_loss: 2.0255 - val_accuracy: 0.6850
Time taken to train Model: 60.82 seconds
In [57]:
train_loss = history_03.history['loss'][-1]  # Training loss from the last epoch
train_accuracy = history_03.history['accuracy'][-1]  # Training accuracy from the last epoch
val_loss = history_03.history['val_loss'][-1]  # Validation loss from the last epoch
val_accuracy = history_03.history['val_accuracy'][-1]  # Validation accuracy from the last epoch

# Print training and validation metrics
print(f"Training Loss: {train_loss:.3f}, Training Accuracy: {train_accuracy:.3f}")
print(f"Validation Loss: {val_loss:.3f}, Validation Accuracy: {val_accuracy:.3f}")


model_03 = tf.keras.models.load_model("A2_Exp_03_2CNN_2DNN.h5")
test_loss, test_accuracy = model_03.evaluate(x_test_norm, y_test, verbose=0)
print(f"Test Loss: {test_loss:.3f}, Test Accuracy: {test_accuracy:.3f}")

results["Experiment3"]["Test Accuracy"] = round(test_accuracy,3)
results["Experiment3"]["Test Loss"] = round(test_loss,3)
results["Experiment3"]["Train Accuracy"] = round(train_accuracy,3)
results["Experiment3"]["Train Loss"] = round(train_loss,3)
results["Experiment3"]["Validation Accuracy"] = round(val_accuracy,3)
results["Experiment3"]["Validation Loss"] = round(val_loss,3)
Training Loss: 0.085, Training Accuracy: 0.970
Validation Loss: 2.025, Validation Accuracy: 0.685
Test Loss: 0.865, Test Accuracy: 0.713
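
The training and validation numbers above come from the last epoch, while the test numbers come from the checkpoint reloaded from disk. `ModelCheckpoint` monitors `val_loss` by default, and `EarlyStopping` here monitors `val_accuracy`, so the two callbacks can settle on different epochs. A sketch using the validation values copied from the Experiment 3 training log (`best_epoch` is a hypothetical helper):

```python
# Validation metrics for epochs 1-19, copied from the Experiment 3 training log
val_loss = [1.3456, 1.0515, 1.0442, 0.9475, 0.9167, 0.8989, 0.9226, 0.9711, 0.9961,
            1.1377, 1.2198, 1.3672, 1.5439, 1.6647, 1.7791, 1.7916, 1.8698, 1.9822, 2.0255]
val_accuracy = [0.5360, 0.6150, 0.6276, 0.6708, 0.6832, 0.6956, 0.7056, 0.6940, 0.7092,
                0.6926, 0.6988, 0.7056, 0.6946, 0.6838, 0.6930, 0.6912, 0.6914, 0.6862, 0.6850]

def best_epoch(values, mode):
    """Return the 1-based epoch of the first minimum (mode='min') or maximum (mode='max')."""
    pick = min if mode == "min" else max
    return values.index(pick(values)) + 1

checkpoint_epoch = best_epoch(val_loss, "min")    # epoch kept by ModelCheckpoint (lowest val_loss)
best_acc_epoch = best_epoch(val_accuracy, "max")  # last epoch where val_accuracy improved
stop_epoch = best_acc_epoch + 10                  # patience=10 epochs without improvement
print(checkpoint_epoch, best_acc_epoch, stop_epoch)
```

On these numbers the saved model is the epoch 6 model (val_loss 0.8989), which would explain why the test loss of 0.865 is far below the last-epoch validation loss of 2.025.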
In [58]:
pred03 = model_03.predict(x_test_norm)
print('shape of preds: ', pred03.shape)
history_03_dict = history_03.history
history_03_df=pd.DataFrame(history_03_dict)
history_03_df.tail().round(3)
313/313 [==============================] - 0s 952us/step
shape of preds:  (10000, 10)
Out[58]:
loss accuracy val_loss val_accuracy
14 0.108 0.962 1.779 0.693
15 0.089 0.969 1.792 0.691
16 0.086 0.970 1.870 0.691
17 0.074 0.974 1.982 0.686
18 0.085 0.970 2.025 0.685

Plotting Performance Metrics

We use Matplotlib to create two stacked plots: training and validation accuracy per epoch on top, and training and validation loss per epoch below.

In [59]:
plt.subplots(figsize=(16,12))
plt.tight_layout()
display_training_curves(history_03.history['accuracy'], history_03.history['val_accuracy'], 'accuracy', 211)
display_training_curves(history_03.history['loss'], history_03.history['val_loss'], 'loss', 212)
No description has been provided for this image

Confusion matrices

We use sklearn.metrics to print a classification report, then visualize the confusion matrix and see what that tells us.

In [60]:
pred03_cm=np.argmax(pred03, axis=1)
print_validation_report(y_test, pred03_cm)
Classification Report
              precision    recall  f1-score   support

           0       0.74      0.79      0.77      1000
           1       0.77      0.87      0.82      1000
           2       0.64      0.56      0.60      1000
           3       0.56      0.54      0.55      1000
           4       0.65      0.67      0.66      1000
           5       0.63      0.58      0.61      1000
           6       0.66      0.88      0.75      1000
           7       0.81      0.70      0.75      1000
           8       0.85      0.80      0.82      1000
           9       0.83      0.75      0.79      1000

    accuracy                           0.71     10000
   macro avg       0.71      0.71      0.71     10000
weighted avg       0.71      0.71      0.71     10000

Accuracy Score: 0.7128
Root Mean Square Error: 2.2347930552961723
In [61]:
plot_confusion_matrix(y_test,pred03_cm)
No description has been provided for this image
In [62]:
cm = sns.light_palette((260, 75, 60), input="husl", as_cmap=True)
df = pd.DataFrame(pred03[0:20], columns = ['airplane'
                                          ,'automobile'
                                          ,'bird'
                                          ,'cat'
                                          ,'deer'
                                          ,'dog'
                                          ,'frog'
                                          ,'horse'
                                          ,'ship'
                                          ,'truck'])
df.style.format("{:.2%}").background_gradient(cmap=cm)
Out[62]:
  airplane automobile bird cat deer dog frog horse ship truck
0 0.16% 0.15% 0.55% 83.54% 0.11% 11.89% 3.19% 0.01% 0.38% 0.01%
1 0.16% 11.03% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 88.78% 0.03%
2 2.38% 10.27% 0.04% 0.33% 0.45% 0.08% 0.03% 0.21% 83.92% 2.28%
3 55.12% 3.38% 6.45% 0.80% 9.03% 0.01% 0.12% 0.19% 24.88% 0.02%
4 0.00% 0.00% 0.37% 2.77% 59.21% 0.20% 37.45% 0.00% 0.00% 0.00%
5 0.02% 0.01% 2.15% 1.03% 1.19% 4.60% 90.99% 0.01% 0.00% 0.01%
6 0.11% 91.51% 0.00% 0.12% 0.00% 2.78% 0.04% 0.00% 0.00% 5.43%
7 3.91% 0.23% 44.25% 2.92% 4.65% 1.07% 42.18% 0.24% 0.23% 0.33%
8 0.23% 0.04% 22.25% 54.10% 4.97% 11.04% 4.12% 3.21% 0.01% 0.03%
9 0.60% 98.57% 0.01% 0.00% 0.01% 0.00% 0.02% 0.00% 0.04% 0.75%
10 78.58% 0.15% 0.84% 1.81% 17.57% 0.45% 0.08% 0.18% 0.21% 0.14%
11 0.00% 0.16% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 99.84%
12 0.11% 1.77% 13.25% 10.30% 3.56% 56.92% 9.99% 0.22% 3.85% 0.02%
13 0.02% 0.00% 0.00% 0.00% 0.02% 0.02% 0.00% 99.94% 0.00% 0.00%
14 0.01% 6.16% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.14% 93.69%
15 0.82% 0.47% 0.05% 0.46% 0.09% 0.14% 55.77% 0.00% 42.21% 0.00%
16 0.00% 0.08% 0.06% 6.27% 0.00% 92.59% 0.02% 0.96% 0.00% 0.00%
17 0.77% 0.05% 1.49% 24.14% 1.27% 28.00% 1.74% 40.95% 1.15% 0.44%
18 0.08% 0.95% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 98.71% 0.25%
19 0.00% 0.01% 0.08% 0.17% 0.03% 0.03% 99.67% 0.00% 0.00% 0.00%
In [63]:
layer_names = []
for layer in model_03.layers:
    layer_names.append(layer.name)

print(layer_names)

# Extracts the outputs of all 7 layers:
layer_outputs_03 = [layer.output for layer in model_03.layers[:7]]
# Creates a model that will return these outputs, given the model input:
activation_model_03 = tf.keras.models.Model(inputs=model_03.input, outputs=layer_outputs_03)

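# Get activation values for the t-SNE plot.
# Note: with these 7 layers, index -3 is the flatten output (9216 features); the 384-unit dense layer is index -2.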
# activations_03 = activation_model_03.predict(x_valid_norm[:3250])
activations_03 = activation_model_03.predict(x_valid_norm[:1000])
dense_layer_activations_03 = activations_03[-3]
output_layer_activations_03 = activations_03[-1]
['conv2d', 'max_pooling2d', 'conv2d_1', 'max_pooling2d_1', 'flatten', 'dense', 'dense_1']
32/32 [==============================] - 0s 2ms/step
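
Negative indices into the activation list are easy to misread. With the seven layers printed above, index -1 is `dense_1`, index -2 is `dense`, and index -3 is `flatten`. The sketch below looks activations up by layer name instead. The zero arrays are stand-ins that only carry the output shapes from `model_03.summary()`, with a hypothetical batch of 4 images.

```python
import numpy as np

layer_names = ['conv2d', 'max_pooling2d', 'conv2d_1', 'max_pooling2d_1',
               'flatten', 'dense', 'dense_1']

# Stand-in activations (zeros) with the per-layer output shapes from the model summary
shapes = [(4, 30, 30, 128), (4, 15, 15, 128), (4, 13, 13, 256), (4, 6, 6, 256),
          (4, 9216), (4, 384), (4, 10)]
activations = [np.zeros(shape, dtype=np.float32) for shape in shapes]

# Map each layer name to its activation array
by_name = dict(zip(layer_names, activations))
print(by_name['dense'].shape)                  # the 384-unit dense layer
print(layer_names[-3], activations[-3].shape)  # what index -3 actually selects
```

Looking layers up by name keeps the selection correct if layers are added or removed later.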

sklearn.manifold.TSNE

https://scikit-learn.org/stable/modules/generated/sklearn.manifold.TSNE.html

In [64]:
# Reduce the dimension using t-SNE to visualize in a scatterplot
tsne_03 = TSNE(n_components=2, verbose=1, init='pca', learning_rate='auto', perplexity=40, n_iter=300)
tsne_results_03 = tsne_03.fit_transform(dense_layer_activations_03)

# Scaling
tsne_results_03 = (tsne_results_03 - tsne_results_03.min()) / (tsne_results_03.max() - tsne_results_03.min())
[t-SNE] Computing 121 nearest neighbors...
[t-SNE] Indexed 1000 samples in 0.002s...
[t-SNE] Computed neighbors for 1000 samples in 0.245s...
[t-SNE] Computed conditional probabilities for sample 1000 / 1000
[t-SNE] Mean sigma: 2.503065
[t-SNE] KL divergence after 250 iterations with early exaggeration: 67.902954
[t-SNE] KL divergence after 300 iterations: 1.957590
In [65]:
cmap = plt.cm.tab10
plt.figure(figsize=(16,10))
# scatter = plt.scatter(tsne_results_03[:,0],tsne_results_03[:,1], c=y_valid_split[:3250], s=10, cmap=cmap)
scatter = plt.scatter(tsne_results_03[:,0],tsne_results_03[:,1], c=y_valid_split[:1000], s=10, cmap=cmap)
plt.legend(handles=scatter.legend_elements()[0], labels=class_names)

image_positions = np.array([[1., 1.]])
for index, position in enumerate(tsne_results_03):
    dist = np.sum((position - image_positions) ** 2, axis=1)
    if np.min(dist) > 0.02: # if far enough from other images
        image_positions = np.r_[image_positions, [position]]
        imagebox = mpl.offsetbox.AnnotationBbox(
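            # Thumbnails come from the validation images that were embedded (assumes x_valid_norm is scaled to [0, 1])
            mpl.offsetbox.OffsetImage(x_valid_norm[index], cmap="binary"),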
            position, bboxprops={"lw": 1})
        plt.gca().add_artist(imagebox)
plt.axis("off")
plt.show()
No description has been provided for this image

Result2:

Take Experiment 3: extract the outputs of 2 filters from the 2 max pooling layers and visualize them in a grid as images. Check whether the regions that "light up" correspond to features in the original image.

In [66]:
(_,_), (test_images, test_labels) = tf.keras.datasets.cifar10.load_data()

img = test_images[2004]
# Feed the model the same preprocessed input it was trained on (x_test_norm), not raw 0-255 pixels
img_tensor = np.expand_dims(x_test_norm[2004], axis=0)

class_names = ['airplane'
,'automobile'
,'bird'
,'cat'
,'deer'
,'dog'
,'frog'
,'horse'
,'ship'
,'truck']

plt.imshow(img, cmap='viridis')
plt.axis('off')
plt.show()
No description has been provided for this image
In [67]:
activations_cnn_03 = activation_model_03.predict(img_tensor)
len(activations_cnn_03)
1/1 [==============================] - 0s 67ms/step
Out[67]:
7
In [68]:
layer_names = []
for layer in model_03.layers:
    layer_names.append(layer.name)

layer_names
Out[68]:
['conv2d',
 'max_pooling2d',
 'conv2d_1',
 'max_pooling2d_1',
 'flatten',
 'dense',
 'dense_1']
In [69]:
# Names of the first three layers, used as titles for the plots
layer_names = []
for layer in model_03.layers[:3]:
    layer_names.append(layer.name)

images_per_row = 16

# Now let's display our feature maps
for layer_name, layer_activation in zip(layer_names, activations_cnn_03):
    # This is the number of features in the feature map
    n_features = layer_activation.shape[-1]

    # The feature map has shape (1, size, size, n_features)
    size = layer_activation.shape[1]

    # We will tile the activation channels in this matrix
    n_cols = n_features // images_per_row
    display_grid = np.zeros((size * n_cols, images_per_row * size))

    # We'll tile each filter into this big horizontal grid
    for col in range(n_cols):
        for row in range(images_per_row):
            channel_image = layer_activation[0,
                                             :, :,
                                             col * images_per_row + row]
            # Post-process the feature to make it visually palatable
            channel_image -= channel_image.mean()
            channel_image /= (channel_image.std() + 1e-5)  # small epsilon avoids division by zero for constant maps
            channel_image *= 64
            channel_image += 128
            channel_image = np.clip(channel_image, 0, 255).astype('uint8')
            display_grid[col * size : (col + 1) * size,
                         row * size : (row + 1) * size] = channel_image

    # Display the grid
    scale = 1. / size
    plt.figure(figsize=(scale * display_grid.shape[1],
                        scale * display_grid.shape[0]))
    plt.title(layer_name)
    plt.grid(False)
    plt.imshow(display_grid, aspect='auto', cmap='viridis')

plt.show();
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
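
The post-processing step in the loop above standardizes each channel and maps it to the 0-255 range. A standalone sketch of that step follows. `to_display_image` is a hypothetical name, the arrays are synthetic stand-ins for real feature maps, and the small epsilon is an addition that avoids dividing by zero when a filter outputs a constant map.

```python
import numpy as np

def to_display_image(channel, eps=1e-5):
    """Standardize a 2-D feature map and map it to uint8 for plotting."""
    channel = channel.astype(np.float32)  # astype copies, so the activations stay untouched
    channel = (channel - channel.mean()) / (channel.std() + eps)
    channel = channel * 64 + 128          # mean maps to 128, one std spans 64 gray levels
    return np.clip(channel, 0, 255).astype(np.uint8)

rng = np.random.default_rng(0)
feature_map = rng.random((15, 15)).astype(np.float32)  # random stand-in for one channel
dead_filter = np.zeros((15, 15), dtype=np.float32)     # a filter that never activates

img = to_display_image(feature_map)
flat = to_display_image(dead_filter)
print(img.dtype, img.min(), img.max(), flat.min(), flat.max())
```

Because each channel is standardized separately, brightness is comparable within a tile but not across tiles.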

Experiment 4

  • CNN with 3 convolution/max pooling layers
  • 1 fully-connected layer
  • no regularization
Build CNN Model
In [70]:
k.clear_session()
model_04 = Sequential([
  Conv2D(filters=128, kernel_size=(3, 3), strides=(1, 1), activation=tf.nn.relu,input_shape=x_train_norm.shape[1:]),
  MaxPool2D((2, 2),strides=2),
  Conv2D(filters=256, kernel_size=(3, 3), strides=(1, 1), activation=tf.nn.relu),
  MaxPool2D((2, 2),strides=2),
  Conv2D(filters=512, kernel_size=(3, 3), strides=(1, 1), activation=tf.nn.relu),
  MaxPool2D((2, 2),strides=2),
  Flatten(),
  Dense(units=384,activation=tf.nn.relu),
  Dense(units=10, activation=tf.nn.softmax)
])
results["Experiment4"] = {}
results["Experiment4"]["Architecture"] = "• CNN with 3 convolution/max pooling layers\n • 1 fully-connected layer\n • no regularization"
In [71]:
model_04.summary()
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d (Conv2D)             (None, 30, 30, 128)       3584      
                                                                 
 max_pooling2d (MaxPooling2  (None, 15, 15, 128)       0         
 D)                                                              
                                                                 
 conv2d_1 (Conv2D)           (None, 13, 13, 256)       295168    
                                                                 
 max_pooling2d_1 (MaxPoolin  (None, 6, 6, 256)         0         
 g2D)                                                            
                                                                 
 conv2d_2 (Conv2D)           (None, 4, 4, 512)         1180160   
                                                                 
 max_pooling2d_2 (MaxPoolin  (None, 2, 2, 512)         0         
 g2D)                                                            
                                                                 
 flatten (Flatten)           (None, 2048)              0         
                                                                 
 dense (Dense)               (None, 384)               786816    
                                                                 
 dense_1 (Dense)             (None, 10)                3850      
                                                                 
=================================================================
Total params: 2269578 (8.66 MB)
Trainable params: 2269578 (8.66 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
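
The output shapes in the summary follow from two rules: a 3×3 convolution with stride 1 and no padding shrinks each spatial side by 2, and 2×2 max pooling with stride 2 halves it, rounding down. A sketch that traces the spatial size through the three blocks, assuming 32×32 inputs (the CIFAR-10 image size); the helper names are hypothetical.

```python
def conv_out(size, kernel=3, stride=1):
    """Output side length of a 'valid' (unpadded) convolution."""
    return (size - kernel) // stride + 1

def pool_out(size, pool=2, stride=2):
    """Output side length of 'valid' max pooling."""
    return (size - pool) // stride + 1

size = 32  # assumed input side length (CIFAR-10 images are 32x32)
trace = []
for filters in (128, 256, 512):
    size = conv_out(size)
    trace.append(("conv", size, filters))
    size = pool_out(size)
    trace.append(("pool", size, filters))

flatten_features = size * size * 512
print(trace, flatten_features)
```

The third block reduces the flattened vector from 9,216 to 2,048 features, which is why this deeper model has fewer total parameters (2,269,578) than the two-block model (3,841,930).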
In [72]:
keras.utils.plot_model(model_04, "CIFAR10_EXP_04.png", show_shapes=True)
You must install pydot (`pip install pydot`) and install graphviz (see instructions at https://graphviz.gitlab.io/download/) for plot_model to work.
In [73]:
model_04.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
              metrics=['accuracy'])

Model Train

In [74]:
# Start time
start_time = time.time()

history_04 = model_04.fit(x_train_norm
                    ,y_train_split
                    ,epochs=200
                    ,batch_size=64
                    ,verbose=1
                    ,validation_data=(x_valid_norm, y_valid_split)
                    ,callbacks=[
                     tf.keras.callbacks.ModelCheckpoint("A2_Exp_04_3CNN.h5",save_best_only=True,save_weights_only=False)
                     ,tf.keras.callbacks.EarlyStopping(monitor='val_accuracy', patience=10),
                    ]
                   )

# End time
end_time = time.time()

# Calculate and print the time taken
elapsed_time = end_time - start_time
print(f"Time taken to train Model: {elapsed_time:.2f} seconds")
results["Experiment4"]["Train Time"] = str(round(elapsed_time, 2)) + ' seconds'
Epoch 1/200
704/704 [==============================] - 5s 5ms/step - loss: 1.4767 - accuracy: 0.4625 - val_loss: 1.2243 - val_accuracy: 0.5536
Epoch 2/200
704/704 [==============================] - 3s 4ms/step - loss: 1.0333 - accuracy: 0.6361 - val_loss: 0.9642 - val_accuracy: 0.6586
Epoch 3/200
704/704 [==============================] - 3s 4ms/step - loss: 0.8341 - accuracy: 0.7078 - val_loss: 0.9427 - val_accuracy: 0.6744
Epoch 4/200
704/704 [==============================] - 3s 4ms/step - loss: 0.6947 - accuracy: 0.7584 - val_loss: 0.8242 - val_accuracy: 0.7150
Epoch 5/200
704/704 [==============================] - 3s 4ms/step - loss: 0.5756 - accuracy: 0.7968 - val_loss: 0.8089 - val_accuracy: 0.7252
Epoch 6/200
704/704 [==============================] - 3s 4ms/step - loss: 0.4726 - accuracy: 0.8346 - val_loss: 0.8448 - val_accuracy: 0.7164
Epoch 7/200
704/704 [==============================] - 3s 4ms/step - loss: 0.3756 - accuracy: 0.8683 - val_loss: 0.8963 - val_accuracy: 0.7264
Epoch 8/200
704/704 [==============================] - 3s 4ms/step - loss: 0.2996 - accuracy: 0.8940 - val_loss: 0.9447 - val_accuracy: 0.7218
Epoch 9/200
704/704 [==============================] - 3s 4ms/step - loss: 0.2338 - accuracy: 0.9171 - val_loss: 0.9933 - val_accuracy: 0.7356
Epoch 10/200
704/704 [==============================] - 3s 4ms/step - loss: 0.1809 - accuracy: 0.9363 - val_loss: 1.1054 - val_accuracy: 0.7296
Epoch 11/200
704/704 [==============================] - 3s 4ms/step - loss: 0.1465 - accuracy: 0.9491 - val_loss: 1.2160 - val_accuracy: 0.7322
Epoch 12/200
704/704 [==============================] - 3s 4ms/step - loss: 0.1395 - accuracy: 0.9502 - val_loss: 1.2887 - val_accuracy: 0.7386
Epoch 13/200
704/704 [==============================] - 3s 4ms/step - loss: 0.1124 - accuracy: 0.9593 - val_loss: 1.3953 - val_accuracy: 0.7294
Epoch 14/200
704/704 [==============================] - 3s 4ms/step - loss: 0.1113 - accuracy: 0.9627 - val_loss: 1.3615 - val_accuracy: 0.7360
Epoch 15/200
704/704 [==============================] - 3s 4ms/step - loss: 0.0964 - accuracy: 0.9661 - val_loss: 1.4959 - val_accuracy: 0.7182
Epoch 16/200
704/704 [==============================] - 3s 4ms/step - loss: 0.0878 - accuracy: 0.9696 - val_loss: 1.6343 - val_accuracy: 0.7300
Epoch 17/200
704/704 [==============================] - 3s 4ms/step - loss: 0.0906 - accuracy: 0.9686 - val_loss: 1.6486 - val_accuracy: 0.7308
Epoch 18/200
704/704 [==============================] - 3s 4ms/step - loss: 0.0759 - accuracy: 0.9735 - val_loss: 1.7013 - val_accuracy: 0.7388
Epoch 19/200
704/704 [==============================] - 3s 4ms/step - loss: 0.0865 - accuracy: 0.9709 - val_loss: 1.8354 - val_accuracy: 0.7174
Epoch 20/200
704/704 [==============================] - 3s 4ms/step - loss: 0.0712 - accuracy: 0.9756 - val_loss: 1.7544 - val_accuracy: 0.7198
Epoch 21/200
704/704 [==============================] - 3s 4ms/step - loss: 0.0773 - accuracy: 0.9738 - val_loss: 1.7912 - val_accuracy: 0.7376
Epoch 22/200
704/704 [==============================] - 3s 4ms/step - loss: 0.0713 - accuracy: 0.9757 - val_loss: 1.9533 - val_accuracy: 0.7162
Epoch 23/200
704/704 [==============================] - 3s 4ms/step - loss: 0.0631 - accuracy: 0.9785 - val_loss: 2.0287 - val_accuracy: 0.7156
Epoch 24/200
704/704 [==============================] - 3s 4ms/step - loss: 0.0663 - accuracy: 0.9779 - val_loss: 1.8894 - val_accuracy: 0.7240
Epoch 25/200
704/704 [==============================] - 3s 4ms/step - loss: 0.0607 - accuracy: 0.9790 - val_loss: 1.9877 - val_accuracy: 0.7364
Epoch 26/200
704/704 [==============================] - 3s 4ms/step - loss: 0.0596 - accuracy: 0.9802 - val_loss: 2.0521 - val_accuracy: 0.7298
Epoch 27/200
704/704 [==============================] - 3s 4ms/step - loss: 0.0743 - accuracy: 0.9764 - val_loss: 2.1023 - val_accuracy: 0.7310
Epoch 28/200
704/704 [==============================] - 3s 4ms/step - loss: 0.0585 - accuracy: 0.9808 - val_loss: 1.9562 - val_accuracy: 0.7424
Epoch 29/200
704/704 [==============================] - 3s 4ms/step - loss: 0.0570 - accuracy: 0.9818 - val_loss: 2.0703 - val_accuracy: 0.7324
Epoch 30/200
704/704 [==============================] - 3s 4ms/step - loss: 0.0555 - accuracy: 0.9825 - val_loss: 2.1394 - val_accuracy: 0.7318
Epoch 31/200
704/704 [==============================] - 3s 4ms/step - loss: 0.0687 - accuracy: 0.9777 - val_loss: 2.1082 - val_accuracy: 0.7262
Epoch 32/200
704/704 [==============================] - 3s 4ms/step - loss: 0.0583 - accuracy: 0.9816 - val_loss: 2.0926 - val_accuracy: 0.7282
Epoch 33/200
704/704 [==============================] - 3s 4ms/step - loss: 0.0581 - accuracy: 0.9808 - val_loss: 2.0278 - val_accuracy: 0.7218
Epoch 34/200
704/704 [==============================] - 3s 4ms/step - loss: 0.0513 - accuracy: 0.9838 - val_loss: 2.1546 - val_accuracy: 0.7278
Epoch 35/200
704/704 [==============================] - 3s 4ms/step - loss: 0.0498 - accuracy: 0.9839 - val_loss: 2.2408 - val_accuracy: 0.7178
Epoch 36/200
704/704 [==============================] - 3s 4ms/step - loss: 0.0507 - accuracy: 0.9845 - val_loss: 2.3039 - val_accuracy: 0.7204
Epoch 37/200
704/704 [==============================] - 3s 4ms/step - loss: 0.0573 - accuracy: 0.9814 - val_loss: 2.1244 - val_accuracy: 0.7278
Epoch 38/200
704/704 [==============================] - 3s 4ms/step - loss: 0.0477 - accuracy: 0.9847 - val_loss: 2.3224 - val_accuracy: 0.7228
Time taken to train Model: 119.00 seconds
In [75]:
train_loss = history_04.history['loss'][-1]  # Training loss from the last epoch
train_accuracy = history_04.history['accuracy'][-1]  # Training accuracy from the last epoch
val_loss = history_04.history['val_loss'][-1]  # Validation loss from the last epoch
val_accuracy = history_04.history['val_accuracy'][-1]  # Validation accuracy from the last epoch

# Print training and validation metrics
print(f"Training Loss: {train_loss:.3f}, Training Accuracy: {train_accuracy:.3f}")
print(f"Validation Loss: {val_loss:.3f}, Validation Accuracy: {val_accuracy:.3f}")

model_04 = tf.keras.models.load_model("A2_Exp_04_3CNN.h5")
test_loss, test_accuracy = model_04.evaluate(x_test_norm, y_test, verbose=0)
print(f"Test Loss: {test_loss:.3f}, Test Accuracy: {test_accuracy:.3f}")

results["Experiment4"]["Test Accuracy"] = round(test_accuracy,3)
results["Experiment4"]["Test Loss"] = round(test_loss,3)
results["Experiment4"]["Train Accuracy"] = round(train_accuracy,3)
results["Experiment4"]["Train Loss"] = round(train_loss,3)
results["Experiment4"]["Validation Accuracy"] = round(val_accuracy,3)
results["Experiment4"]["Validation Loss"] = round(val_loss,3)
Training Loss: 0.048, Training Accuracy: 0.985
Validation Loss: 2.322, Validation Accuracy: 0.723
Test Loss: 0.794, Test Accuracy: 0.736
In [76]:
pred04 = model_04.predict(x_test_norm)
print('shape of preds: ', pred04.shape)

history_04_dict = history_04.history
history_04_df=pd.DataFrame(history_04_dict)
history_04_df.tail().round(3)
313/313 [==============================] - 0s 1ms/step
shape of preds:  (10000, 10)
Out[76]:
     loss  accuracy  val_loss  val_accuracy
33  0.051     0.984     2.155         0.728
34  0.050     0.984     2.241         0.718
35  0.051     0.984     2.304         0.720
36  0.057     0.981     2.124         0.728
37  0.048     0.985     2.322         0.723

Plotting Performance Metrics¶

We use Matplotlib to create two plots that show training and validation accuracy, and training and validation loss, for each training epoch.

In [77]:
plt.subplots(figsize=(16,12))
plt.tight_layout()
display_training_curves(history_04.history['accuracy'], history_04.history['val_accuracy'], 'accuracy', 211)
display_training_curves(history_04.history['loss'], history_04.history['val_loss'], 'loss', 212)
[Figure: training and validation accuracy and loss curves, Experiment 4]

Confusion matrices¶

Using sklearn.metrics, we print a classification report. Then we visualize the confusion matrix and see what that tells us.

In [78]:
pred04_cm=np.argmax(pred04, axis=1)
print_validation_report(y_test, pred04_cm)
Classification Report
              precision    recall  f1-score   support

           0       0.75      0.77      0.76      1000
           1       0.91      0.82      0.86      1000
           2       0.64      0.65      0.65      1000
           3       0.67      0.44      0.53      1000
           4       0.64      0.76      0.69      1000
           5       0.66      0.66      0.66      1000
           6       0.79      0.79      0.79      1000
           7       0.80      0.75      0.77      1000
           8       0.72      0.91      0.80      1000
           9       0.80      0.82      0.81      1000

    accuracy                           0.74     10000
   macro avg       0.74      0.74      0.73     10000
weighted avg       0.74      0.74      0.73     10000

Accuracy Score: 0.736
Root Mean Square Error: 2.1444113411377024
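
The helper `print_validation_report` is defined elsewhere in the notebook and is not shown here. As a rough illustration of where the per-class precision, recall and f1 figures come from, the sketch below derives them from a confusion matrix in plain Python. The function names and the toy labels are hypothetical and are not part of the notebook. Note also that a root mean square error computed over class indices treats the labels as ordinal, which the CIFAR-10 classes are not, so accuracy and the per-class scores are the more informative figures.

```python
def confusion_counts(y_true, y_pred, n_classes):
    """Return an n_classes x n_classes matrix; rows are true labels, columns are predictions."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def per_class_report(matrix):
    """Compute (precision, recall, f1) for each class from a confusion matrix."""
    n = len(matrix)
    report = []
    for c in range(n):
        tp = matrix[c][c]
        predicted_c = sum(matrix[r][c] for r in range(n))  # column sum
        actual_c = sum(matrix[c])                          # row sum (the "support")
        precision = tp / predicted_c if predicted_c else 0.0
        recall = tp / actual_c if actual_c else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report.append((precision, recall, f1))
    return report


# Toy labels for illustration only (not taken from the notebook's data).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
matrix = confusion_counts(y_true, y_pred, n_classes=3)
report = per_class_report(matrix)
for label, (precision, recall, f1) in enumerate(report):
    print(f"{label}: precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Read this way, the low recall for class 3 (cat) in the report above means that most of the errors for that class sit off the diagonal in row 3 of the confusion matrix.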
In [79]:
plot_confusion_matrix(y_test,pred04_cm)
[Figure: confusion matrix for Experiment 4 on the test set]
In [80]:
cm = sns.light_palette((260, 75, 60), input="husl", as_cmap=True)
df = pd.DataFrame(pred04[0:20], columns = ['airplane'
                                          ,'automobile'
                                          ,'bird'
                                          ,'cat'
                                          ,'deer'
                                          ,'dog'
                                          ,'frog'
                                          ,'horse'
                                          ,'ship'
                                          ,'truck'])
df.style.format("{:.2%}").background_gradient(cmap=cm)
Out[80]:
  airplane automobile bird cat deer dog frog horse ship truck
0 0.03% 0.01% 0.05% 91.71% 0.27% 3.54% 2.66% 0.02% 1.70% 0.01%
1 2.28% 0.14% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 97.58% 0.01%
2 11.29% 3.51% 0.26% 0.50% 0.05% 0.07% 0.05% 0.23% 80.51% 3.55%
3 93.39% 0.63% 0.33% 0.04% 0.15% 0.00% 0.04% 0.01% 5.32% 0.09%
4 0.00% 0.00% 0.41% 0.15% 95.40% 0.01% 4.03% 0.00% 0.00% 0.00%
5 0.01% 0.02% 0.65% 0.51% 0.67% 2.41% 95.29% 0.12% 0.31% 0.02%
6 0.20% 3.41% 0.13% 0.98% 0.00% 0.36% 0.12% 0.04% 0.22% 94.54%
7 0.71% 0.05% 12.72% 5.50% 7.49% 3.36% 69.84% 0.06% 0.11% 0.16%
8 0.20% 0.06% 2.11% 70.86% 8.08% 11.48% 3.65% 3.32% 0.09% 0.15%
9 0.62% 69.43% 0.18% 0.01% 0.00% 0.01% 0.63% 0.00% 0.98% 28.12%
10 13.42% 0.01% 10.04% 5.63% 55.98% 5.28% 0.12% 4.54% 4.79% 0.21%
11 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 100.00%
12 0.02% 0.12% 14.71% 7.28% 28.53% 43.26% 3.36% 2.56% 0.14% 0.01%
13 0.00% 0.00% 0.00% 0.01% 0.15% 0.34% 0.00% 99.50% 0.00% 0.00%
14 0.00% 0.02% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 99.98%
15 5.27% 1.11% 0.61% 1.61% 1.57% 0.25% 23.30% 0.00% 66.18% 0.10%
16 0.01% 0.07% 0.34% 37.65% 0.04% 56.56% 0.09% 4.82% 0.10% 0.32%
17 0.78% 0.02% 8.15% 12.73% 23.52% 13.49% 1.40% 38.95% 0.54% 0.43%
18 0.01% 0.02% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 99.95% 0.01%
19 0.00% 0.60% 0.15% 0.40% 0.35% 0.17% 97.85% 0.45% 0.00% 0.02%
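
Each row of the table above is one softmax output, so the predicted class is simply the column with the largest probability. A minimal sketch of that lookup follows. The helper name `top_prediction` is hypothetical, and the two rows are copied from the table (percentages converted to fractions).

```python
# Column order matches the DataFrame columns used for the table.
class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck']


def top_prediction(probabilities, names):
    """Return (class name, probability) for the largest entry in one softmax row."""
    best_index = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return names[best_index], probabilities[best_index]


# Row 0: a confident prediction.
row_0 = [0.0003, 0.0001, 0.0005, 0.9171, 0.0027, 0.0354, 0.0266, 0.0002, 0.0170, 0.0001]
# Row 12: probability mass spread over several animal classes.
row_12 = [0.0002, 0.0012, 0.1471, 0.0728, 0.2853, 0.4326, 0.0336, 0.0256, 0.0014, 0.0001]

print(top_prediction(row_0, class_names))   # ('cat', 0.9171)
print(top_prediction(row_12, class_names))  # ('dog', 0.4326)
```

Rows such as 12 and 17, where no class exceeds 50%, are the ones worth inspecting when looking for confusable classes.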
In [81]:
layer_names = []
for layer in model_04.layers:
    layer_names.append(layer.name)

layer_names
Out[81]:
['conv2d',
 'max_pooling2d',
 'conv2d_1',
 'max_pooling2d_1',
 'conv2d_2',
 'max_pooling2d_2',
 'flatten',
 'dense',
 'dense_1']
In [82]:
# Extracts the outputs of every layer (the model has 9, so the slice bound of 13 is never reached):
layer_outputs_04 = [layer.output for layer in model_04.layers[:13]]
# Creates a model that will return these outputs, given the model input:
activation_model_04 = tf.keras.models.Model(inputs=model_04.input, outputs=layer_outputs_04)

# Get activation values for the first 1000 validation images
# activations_04 = activation_model_04.predict(x_valid_norm[:3250])
activations_04 = activation_model_04.predict(x_valid_norm[:1000])
# Index -3 is the 'flatten' output that feeds the hidden dense layer; 'dense' itself is index -2
dense_layer_activations_04 = activations_04[-3]
output_layer_activations_04 = activations_04[-1]
32/32 [==============================] - 0s 1ms/step
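
Negative indices into the list of layer outputs are easy to get wrong when the architecture changes between experiments. One possible safeguard, sketched here under the assumption that the outputs come back in the same order as `model.layers`, is to pair each output with its layer name and look it up by name. The helper `index_by_layer_name` is hypothetical, and strings stand in for the real NumPy arrays so that the example runs without TensorFlow.

```python
def index_by_layer_name(layer_names, outputs):
    """Pair each layer name with its output so it can be looked up by name."""
    if len(layer_names) != len(outputs):
        raise ValueError("expected one output per layer")
    return dict(zip(layer_names, outputs))


# Layer names as listed by the notebook for model_04.
layer_names_04 = ['conv2d', 'max_pooling2d', 'conv2d_1', 'max_pooling2d_1',
                  'conv2d_2', 'max_pooling2d_2', 'flatten', 'dense', 'dense_1']

# Stand-in for the list returned by activation_model_04.predict(...).
placeholder_outputs = [f"<output of {name}>" for name in layer_names_04]

by_name = index_by_layer_name(layer_names_04, placeholder_outputs)
print(layer_names_04[-3])  # flatten
print(by_name['dense'])    # <output of dense>
```

For this model, index -3 resolves to `flatten`, so a name-based lookup makes it explicit which layer the t-SNE plot is built from.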

sklearn.manifold.TSNE¶

https://scikit-learn.org/stable/modules/generated/sklearn.manifold.TSNE.html

In [83]:
# Reduce the dimension using t-SNE to visualize in a scatterplot
tsne_04 = TSNE(n_components=2, verbose=1, init='pca', learning_rate='auto', perplexity=40, n_iter=300)
tsne_results_04 = tsne_04.fit_transform(dense_layer_activations_04)

# Scaling
tsne_results_04 = (tsne_results_04 - tsne_results_04.min()) / (tsne_results_04.max() - tsne_results_04.min())
[t-SNE] Computing 121 nearest neighbors...
[t-SNE] Indexed 1000 samples in 0.001s...
[t-SNE] Computed neighbors for 1000 samples in 0.057s...
[t-SNE] Computed conditional probabilities for sample 1000 / 1000
[t-SNE] Mean sigma: 1.816506
[t-SNE] KL divergence after 250 iterations with early exaggeration: 64.036034
[t-SNE] KL divergence after 300 iterations: 1.867316
In [84]:
cmap = plt.cm.tab10
plt.figure(figsize=(16,10))
# scatter = plt.scatter(tsne_results_04[:,0],tsne_results_04[:,1], c=y_valid_split[:3250], s=10, cmap=cmap)
scatter = plt.scatter(tsne_results_04[:,0],tsne_results_04[:,1], c=y_valid_split[:1000], s=10, cmap=cmap)
plt.legend(handles=scatter.legend_elements()[0], labels=class_names)

image_positions = np.array([[1., 1.]])
for index, position in enumerate(tsne_results_04):
    dist = np.sum((position - image_positions) ** 2, axis=1)
    if np.min(dist) > 0.02: # if far enough from other images
        image_positions = np.r_[image_positions, [position]]
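        # Thumbnails come from the same array the activations were computed on
        # (x_valid_norm, assumed scaled to [0, 1]), not from x_train
        imagebox = mpl.offsetbox.AnnotationBbox(
            mpl.offsetbox.OffsetImage(x_valid_norm[index], cmap="binary"),
            position, bboxprops={"lw": 1})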
        plt.gca().add_artist(imagebox)
plt.axis("off")
plt.show()
[Figure: t-SNE scatter plot of Experiment 4 activations with image thumbnails]
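
The loop in the plotting cell places a thumbnail only when the point is far enough from every thumbnail already placed. The sketch below isolates that greedy rule in plain Python so it can be checked without Matplotlib. The function name and the toy coordinates are hypothetical. The threshold of 0.02 is compared against a squared distance, so it corresponds to a Euclidean distance of roughly 0.14 in the scaled [0, 1] coordinates.

```python
def select_spaced_points(points, min_sq_dist=0.02, start=(1.0, 1.0)):
    """Greedily pick indices of points that are far enough from all earlier picks.

    `points` is a sequence of (x, y) pairs assumed to be scaled to [0, 1].
    `start` seeds the list of occupied positions, as in the notebook's loop.
    """
    occupied = [start]
    chosen = []
    for index, (x, y) in enumerate(points):
        nearest = min((x - ox) ** 2 + (y - oy) ** 2 for ox, oy in occupied)
        if nearest > min_sq_dist:
            occupied.append((x, y))
            chosen.append(index)
    return chosen


# Toy coordinates for illustration; real input would be the scaled t-SNE output.
points = [(0.10, 0.10), (0.12, 0.11), (0.50, 0.50), (0.95, 0.95), (0.55, 0.52)]
print(select_spaced_points(points))  # [0, 2]
```

Because the selection depends on iteration order, earlier samples win ties. The seed position (1, 1) also keeps thumbnails out of the top-right corner.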

Experiment 5¶

  • DNN with 2 layers (384, 768)
  • Batch Normalization
  • L2 Regularization(0.001): commented out in the model code, so not applied in this run
Build DNN Model¶
In [85]:
k.clear_session()
model_05 = Sequential([
   Flatten(input_shape=x_train_norm.shape[1:]),
   Dense(units=384,activation=tf.nn.relu),
#   Dense(units=384,activation=tf.nn.relu,kernel_regularizer=tf.keras.regularizers.L2(0.001)),
   BatchNormalization(),
#   Dropout(0.3),
   Dense(units=768,activation=tf.nn.relu),
#   Dense(units=768,activation=tf.nn.relu,kernel_regularizer=tf.keras.regularizers.L2(0.001)),
   BatchNormalization(),
#   Dropout(0.3),
   Dense(units=10, activation=tf.nn.softmax)
])
results["Experiment5"] = {}
results["Experiment5"]["Architecture"] = "• DNN with 2 layers (384, 768)\n • Batch Normalization\n • L2 Regularization(0.001)"
In [86]:
model_05.summary()
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 flatten (Flatten)           (None, 3072)              0         
                                                                 
 dense (Dense)               (None, 384)               1180032   
                                                                 
 batch_normalization (Batch  (None, 384)               1536      
 Normalization)                                                  
                                                                 
 dense_1 (Dense)             (None, 768)               295680    
                                                                 
 batch_normalization_1 (Bat  (None, 768)               3072      
 chNormalization)                                                
                                                                 
 dense_2 (Dense)             (None, 10)                7690      
                                                                 
=================================================================
Total params: 1488010 (5.68 MB)
Trainable params: 1485706 (5.67 MB)
Non-trainable params: 2304 (9.00 KB)
_________________________________________________________________
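
The parameter counts in the summary can be reproduced by hand. A dense layer has one weight per input-output pair plus one bias per output, and a Keras `BatchNormalization` layer with default settings has four values per feature: gamma and beta are trainable, while the moving mean and moving variance are not. The sketch below is a plain-Python check of that arithmetic; the function names are hypothetical.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


def batchnorm_params(n_features):
    """Return (trainable, non_trainable) counts for default BatchNormalization."""
    return 2 * n_features, 2 * n_features


def count_params(n_inputs, hidden_units, n_classes):
    """Count parameters for Flatten -> [Dense -> BatchNorm]* -> Dense."""
    trainable, non_trainable = 0, 0
    previous = n_inputs
    for units in hidden_units:
        trainable += dense_params(previous, units)
        bn_train, bn_fixed = batchnorm_params(units)
        trainable += bn_train
        non_trainable += bn_fixed
        previous = units
    trainable += dense_params(previous, n_classes)
    return trainable, non_trainable


# 32 * 32 * 3 = 3072 inputs, matching the Flatten output shape in the summary.
trainable, non_trainable = count_params(3072, [384, 768], 10)
print(trainable, non_trainable, trainable + non_trainable)  # 1485706 2304 1488010
```

Almost 80% of the parameters sit in the first dense layer (1,180,032 of 1,488,010), which is the cost of flattening the image instead of sharing weights spatially as a convolution does.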
In [87]:
keras.utils.plot_model(model_05, "CIFAR10_EXP_05.png", show_shapes=True)
You must install pydot (`pip install pydot`) and install graphviz (see instructions at https://graphviz.gitlab.io/download/) for plot_model to work.
In [88]:
model_05.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
              metrics=['accuracy'])

Model Train¶

In [89]:
# Start time
start_time = time.time()

history_05 = model_05.fit(x_train_norm
                    ,y_train_split
                    ,epochs=200
                    ,batch_size=64
                    ,verbose=1
                    ,validation_data=(x_valid_norm, y_valid_split)
                    ,callbacks=[
                     tf.keras.callbacks.ModelCheckpoint("A2_Exp_05_2DNN_BN.h5",save_best_only=True,save_weights_only=False)
                     ,tf.keras.callbacks.EarlyStopping(monitor='val_accuracy', patience=7),
                    ]
                   )

# End time
end_time = time.time()

# Calculate and print the time taken
elapsed_time = end_time - start_time
print(f"Time taken to train Model: {elapsed_time:.2f} seconds")
results["Experiment5"]["Train Time"] = str(round(elapsed_time, 2)) + ' seconds'
Epoch 1/200
704/704 [==============================] - 4s 3ms/step - loss: 1.7934 - accuracy: 0.3757 - val_loss: 1.8845 - val_accuracy: 0.3346
Epoch 2/200
704/704 [==============================] - 2s 3ms/step - loss: 1.6300 - accuracy: 0.4255 - val_loss: 1.9138 - val_accuracy: 0.3520
Epoch 3/200
704/704 [==============================] - 2s 3ms/step - loss: 1.5344 - accuracy: 0.4576 - val_loss: 1.7508 - val_accuracy: 0.3996
Epoch 4/200
704/704 [==============================] - 2s 3ms/step - loss: 1.4796 - accuracy: 0.4782 - val_loss: 1.8204 - val_accuracy: 0.4072
Epoch 5/200
704/704 [==============================] - 2s 3ms/step - loss: 1.4278 - accuracy: 0.4950 - val_loss: 1.7341 - val_accuracy: 0.4230
Epoch 6/200
704/704 [==============================] - 2s 3ms/step - loss: 1.3750 - accuracy: 0.5170 - val_loss: 1.7756 - val_accuracy: 0.4160
Epoch 7/200
704/704 [==============================] - 2s 3ms/step - loss: 1.3320 - accuracy: 0.5323 - val_loss: 1.5763 - val_accuracy: 0.4724
Epoch 8/200
704/704 [==============================] - 2s 3ms/step - loss: 1.2988 - accuracy: 0.5404 - val_loss: 1.7172 - val_accuracy: 0.4260
Epoch 9/200
704/704 [==============================] - 2s 3ms/step - loss: 1.2566 - accuracy: 0.5573 - val_loss: 1.5500 - val_accuracy: 0.4836
Epoch 10/200
704/704 [==============================] - 2s 3ms/step - loss: 1.2227 - accuracy: 0.5688 - val_loss: 1.6693 - val_accuracy: 0.4416
Epoch 11/200
704/704 [==============================] - 2s 3ms/step - loss: 1.1923 - accuracy: 0.5769 - val_loss: 1.6718 - val_accuracy: 0.4568
Epoch 12/200
704/704 [==============================] - 2s 3ms/step - loss: 1.1680 - accuracy: 0.5863 - val_loss: 1.6790 - val_accuracy: 0.4488
Epoch 13/200
704/704 [==============================] - 2s 3ms/step - loss: 1.1472 - accuracy: 0.5940 - val_loss: 1.5340 - val_accuracy: 0.4858
Epoch 14/200
704/704 [==============================] - 2s 3ms/step - loss: 1.1163 - accuracy: 0.6065 - val_loss: 1.8078 - val_accuracy: 0.4478
Epoch 15/200
704/704 [==============================] - 2s 3ms/step - loss: 1.0951 - accuracy: 0.6126 - val_loss: 1.5854 - val_accuracy: 0.4812
Epoch 16/200
704/704 [==============================] - 2s 3ms/step - loss: 1.0763 - accuracy: 0.6208 - val_loss: 1.6121 - val_accuracy: 0.4808
Epoch 17/200
704/704 [==============================] - 2s 3ms/step - loss: 1.0528 - accuracy: 0.6260 - val_loss: 1.6193 - val_accuracy: 0.4784
Epoch 18/200
704/704 [==============================] - 2s 3ms/step - loss: 1.0291 - accuracy: 0.6348 - val_loss: 1.7509 - val_accuracy: 0.4656
Epoch 19/200
704/704 [==============================] - 2s 3ms/step - loss: 1.0133 - accuracy: 0.6390 - val_loss: 1.6757 - val_accuracy: 0.4548
Epoch 20/200
704/704 [==============================] - 2s 3ms/step - loss: 0.9909 - accuracy: 0.6495 - val_loss: 1.6374 - val_accuracy: 0.4772
Time taken to train Model: 41.89 seconds
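
Training stopped after epoch 20 of 200 because of the `EarlyStopping(monitor='val_accuracy', patience=7)` callback. The sketch below is a simplified model of the patience rule (it ignores options such as `min_delta` and `baseline`), applied to the validation accuracies from the log above. The function name is hypothetical.

```python
def early_stopping_epoch(values, patience, higher_is_better=True):
    """Return the 1-based epoch at which training stops, or None if it never does.

    Training stops once `patience` consecutive epochs have passed
    without beating the best value seen so far.
    """
    best = None
    wait = 0
    for epoch, value in enumerate(values, start=1):
        if best is None:
            improved = True
        elif higher_is_better:
            improved = value > best
        else:
            improved = value < best
        if improved:
            best = value
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


# val_accuracy per epoch, copied from the training log above.
val_accuracy_05 = [0.3346, 0.3520, 0.3996, 0.4072, 0.4230, 0.4160, 0.4724,
                   0.4260, 0.4836, 0.4416, 0.4568, 0.4488, 0.4858, 0.4478,
                   0.4812, 0.4808, 0.4784, 0.4656, 0.4548, 0.4772]
print(early_stopping_epoch(val_accuracy_05, patience=7))  # 20
```

The best validation accuracy (0.4858) occurs at epoch 13, and seven epochs without improvement then end the run at epoch 20.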
In [90]:
train_loss = history_05.history['loss'][-1]  # Training loss from the last epoch
train_accuracy = history_05.history['accuracy'][-1]  # Training accuracy from the last epoch
val_loss = history_05.history['val_loss'][-1]  # Validation loss from the last epoch
val_accuracy = history_05.history['val_accuracy'][-1]  # Validation accuracy from the last epoch

# Print training and validation metrics
print(f"Training Loss: {train_loss:.3f}, Training Accuracy: {train_accuracy:.3f}")
print(f"Validation Loss: {val_loss:.3f}, Validation Accuracy: {val_accuracy:.3f}")

model_05 = tf.keras.models.load_model("A2_Exp_05_2DNN_BN.h5")
test_loss, test_accuracy = model_05.evaluate(x_test_norm, y_test, verbose=0)
print(f"Test Loss: {test_loss:.3f}, Test Accuracy: {test_accuracy:.3f}")

results["Experiment5"]["Test Accuracy"] = round(test_accuracy,3)
results["Experiment5"]["Test Loss"] = round(test_loss,3)
results["Experiment5"]["Train Accuracy"] = round(train_accuracy,3)
results["Experiment5"]["Train Loss"] = round(train_loss,3)
results["Experiment5"]["Validation Accuracy"] = round(val_accuracy,3)
results["Experiment5"]["Validation Loss"] = round(val_loss,3)
Training Loss: 0.991, Training Accuracy: 0.650
Validation Loss: 1.637, Validation Accuracy: 0.477
Test Loss: 1.482, Test Accuracy: 0.490
In [91]:
pred05 = model_05.predict(x_test_norm)
print('shape of preds: ', pred05.shape)
313/313 [==============================] - 0s 938us/step
shape of preds:  (10000, 10)
In [92]:
history_05_dict = history_05.history
history_05_dict.keys()

history_05_df=pd.DataFrame(history_05_dict)
history_05_df.tail().round(3)
Out[92]:
     loss  accuracy  val_loss  val_accuracy
15  1.076     0.621     1.612         0.481
16  1.053     0.626     1.619         0.478
17  1.029     0.635     1.751         0.466
18  1.013     0.639     1.676         0.455
19  0.991     0.650     1.637         0.477

Plotting Performance Metrics¶

We use Matplotlib to create two plots that show training and validation accuracy, and training and validation loss, for each training epoch.

In [93]:
plt.subplots(figsize=(16,12))
plt.tight_layout()
display_training_curves(history_05.history['accuracy'], history_05.history['val_accuracy'], 'accuracy', 211)
display_training_curves(history_05.history['loss'], history_05.history['val_loss'], 'loss', 212)
[Figure: training and validation accuracy and loss curves, Experiment 5]

Confusion matrices¶

Using sklearn.metrics, we print a classification report. Then we visualize the confusion matrix and see what that tells us.

In [94]:
pred05_cm=np.argmax(pred05, axis=1)
print_validation_report(y_test, pred05_cm)
Classification Report
              precision    recall  f1-score   support

           0       0.67      0.31      0.43      1000
           1       0.69      0.46      0.55      1000
           2       0.35      0.45      0.40      1000
           3       0.42      0.13      0.20      1000
           4       0.43      0.44      0.43      1000
           5       0.41      0.49      0.44      1000
           6       0.52      0.60      0.56      1000
           7       0.54      0.60      0.56      1000
           8       0.59      0.69      0.64      1000
           9       0.44      0.73      0.55      1000

    accuracy                           0.49     10000
   macro avg       0.51      0.49      0.48     10000
weighted avg       0.51      0.49      0.48     10000

Accuracy Score: 0.4898
Root Mean Square Error: 3.205604467179318
In [95]:
plot_confusion_matrix(y_test,pred05_cm)
[Figure: confusion matrix for Experiment 5 on the test set]
In [96]:
cm = sns.light_palette((260, 75, 60), input="husl", as_cmap=True)
df = pd.DataFrame(pred05[0:20], columns = ['airplane'
                                          ,'automobile'
                                          ,'bird'
                                          ,'cat'
                                          ,'deer'
                                          ,'dog'
                                          ,'frog'
                                          ,'horse'
                                          ,'ship'
                                          ,'truck'])
df.style.format("{:.2%}").background_gradient(cmap=cm)
Out[96]:
  airplane automobile bird cat deer dog frog horse ship truck
0 0.38% 9.75% 4.75% 43.39% 5.04% 32.23% 0.92% 0.00% 2.09% 1.46%
1 0.10% 1.47% 0.00% 0.00% 0.01% 0.00% 0.00% 0.04% 73.42% 24.95%
2 16.85% 7.01% 4.47% 0.04% 0.32% 0.02% 0.05% 8.82% 37.07% 25.34%
3 4.95% 0.49% 7.25% 1.18% 14.89% 1.15% 0.02% 46.92% 22.48% 0.66%
4 0.02% 0.03% 3.28% 0.61% 40.27% 6.05% 49.31% 0.09% 0.31% 0.02%
5 1.02% 0.76% 3.89% 12.78% 10.28% 13.96% 47.91% 1.69% 0.06% 7.63%
6 0.32% 96.28% 0.04% 1.93% 0.00% 1.21% 0.06% 0.04% 0.03% 0.09%
7 0.85% 0.95% 6.63% 0.41% 0.87% 0.12% 86.49% 0.01% 0.02% 3.65%
8 1.01% 0.03% 23.63% 1.62% 59.05% 12.16% 1.01% 0.96% 0.44% 0.07%
9 0.12% 72.50% 0.07% 0.08% 0.24% 0.08% 0.02% 0.06% 1.13% 25.71%
10 13.08% 0.16% 10.60% 0.36% 1.27% 2.20% 0.57% 0.03% 71.68% 0.04%
11 0.01% 4.14% 0.00% 0.12% 0.04% 0.02% 0.04% 0.11% 0.14% 95.37%
12 0.12% 2.57% 18.14% 8.48% 4.32% 42.06% 19.82% 3.36% 0.49% 0.63%
13 19.86% 2.73% 0.35% 0.06% 0.15% 0.47% 0.23% 73.28% 2.19% 0.68%
14 0.02% 17.32% 2.60% 0.92% 0.01% 2.00% 0.24% 0.23% 0.02% 76.65%
15 6.00% 0.04% 1.93% 6.35% 11.71% 30.81% 6.95% 2.44% 33.72% 0.04%
16 6.53% 10.46% 11.06% 20.27% 0.06% 25.95% 0.22% 17.38% 0.08% 8.00%
17 7.17% 0.38% 11.43% 8.87% 9.74% 11.63% 3.44% 25.84% 0.94% 20.58%
18 0.08% 0.30% 0.01% 0.03% 0.16% 0.01% 0.00% 0.10% 99.01% 0.30%
19 0.33% 2.80% 9.26% 5.52% 0.37% 31.39% 23.20% 24.09% 0.06% 3.00%
In [97]:
layer_names = []
for layer in model_05.layers:
    layer_names.append(layer.name)

layer_names
Out[97]:
['flatten',
 'dense',
 'batch_normalization',
 'dense_1',
 'batch_normalization_1',
 'dense_2']
In [98]:
# Extracts the outputs of all 6 layers:
layer_outputs = [layer.output for layer in model_05.layers[:6]]
# Creates a model that will return these outputs, given the model input:
activation_model_05 = tf.keras.models.Model(inputs=model_05.input, outputs=layer_outputs)

# Get activation values for the first 1500 validation images
# activations_05 = activation_model_05.predict(x_valid_norm[:3250])
activations_05 = activation_model_05.predict(x_valid_norm[:1500])
# Index -3 is 'dense_1', the last hidden dense layer (before its batch normalization)
dense_layer_activations_05 = activations_05[-3]
output_layer_activations_05 = activations_05[-1]
47/47 [==============================] - 0s 943us/step

sklearn.manifold.TSNE¶

https://scikit-learn.org/stable/modules/generated/sklearn.manifold.TSNE.html

In [99]:
# Reduce the dimension using t-SNE to visualize in a scatterplot
tsne_05 = TSNE(n_components=2, verbose=1, init='pca', learning_rate='auto', perplexity=40, n_iter=300)
tsne_results_05 = tsne_05.fit_transform(dense_layer_activations_05)

# Scaling
tsne_results_05 = (tsne_results_05 - tsne_results_05.min()) / (tsne_results_05.max() - tsne_results_05.min())
[t-SNE] Computing 121 nearest neighbors...
[t-SNE] Indexed 1500 samples in 0.001s...
[t-SNE] Computed neighbors for 1500 samples in 0.033s...
[t-SNE] Computed conditional probabilities for sample 1000 / 1500
[t-SNE] Computed conditional probabilities for sample 1500 / 1500
[t-SNE] Mean sigma: 4.815121
[t-SNE] KL divergence after 250 iterations with early exaggeration: 65.247437
[t-SNE] KL divergence after 300 iterations: 1.694011
In [100]:
cmap = plt.cm.tab10
plt.figure(figsize=(16,10))
scatter = plt.scatter(tsne_results_05[:,0],tsne_results_05[:,1], c=y_valid_split[:1500], s=10, cmap=cmap)
plt.legend(handles=scatter.legend_elements()[0], labels=class_names)

image_positions = np.array([[1., 1.]])
for index, position in enumerate(tsne_results_05):
    dist = np.sum((position - image_positions) ** 2, axis=1)
    if np.min(dist) > 0.02: # if far enough from other images
        image_positions = np.r_[image_positions, [position]]
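        # Thumbnails come from the same array the activations were computed on
        # (x_valid_norm, assumed scaled to [0, 1]), not from x_train
        imagebox = mpl.offsetbox.AnnotationBbox(
            mpl.offsetbox.OffsetImage(x_valid_norm[index], cmap="binary"),
            position, bboxprops={"lw": 1})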
        plt.gca().add_artist(imagebox)
plt.axis("off")
plt.show()
[Figure: t-SNE scatter plot of Experiment 5 activations with image thumbnails]

Experiment 6¶

  • DNN with 3 layers
  • Regularization: batch normalization
Build DNN Model¶
In [101]:
k.clear_session()
model_06 = Sequential([
  Flatten(input_shape=x_train_norm.shape[1:]),
  Dense(units=384,activation=tf.nn.relu),
  BatchNormalization(),
  Dense(units=768,activation=tf.nn.relu),
  BatchNormalization(),
  Dense(units=1536,activation=tf.nn.relu),
  BatchNormalization(),
  Dense(units=10, activation=tf.nn.softmax)
])
results["Experiment6"] = {}
results["Experiment6"]["Architecture"] = "• DNN with 3 layers\n • Regularization: batch normalization"
In [102]:
model_06.summary()
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 flatten (Flatten)           (None, 3072)              0         
                                                                 
 dense (Dense)               (None, 384)               1180032   
                                                                 
 batch_normalization (Batch  (None, 384)               1536      
 Normalization)                                                  
                                                                 
 dense_1 (Dense)             (None, 768)               295680    
                                                                 
 batch_normalization_1 (Bat  (None, 768)               3072      
 chNormalization)                                                
                                                                 
 dense_2 (Dense)             (None, 1536)              1181184   
                                                                 
 batch_normalization_2 (Bat  (None, 1536)              6144      
 chNormalization)                                                
                                                                 
 dense_3 (Dense)             (None, 10)                15370     
                                                                 
=================================================================
Total params: 2683018 (10.23 MB)
Trainable params: 2677642 (10.21 MB)
Non-trainable params: 5376 (21.00 KB)
_________________________________________________________________
In [103]:
keras.utils.plot_model(model_06, "CIFAR10_EXP_06.png", show_shapes=True)
You must install pydot (`pip install pydot`) and install graphviz (see instructions at https://graphviz.gitlab.io/download/) for plot_model to work.
In [104]:
model_06.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
              metrics=['accuracy'])

Model Train¶

In [105]:
# Start time
start_time = time.time()

history_06 = model_06.fit(x_train_norm
                    ,y_train_split
                    ,epochs=200
                    ,batch_size=64
                    ,verbose=1
                    ,validation_data=(x_valid_norm, y_valid_split)
                    ,callbacks=[
                     tf.keras.callbacks.ModelCheckpoint("A2_Exp_06_3DNN_BN.h5",save_best_only=True,save_weights_only=False)
                     ,tf.keras.callbacks.EarlyStopping(monitor='val_accuracy', patience=7),
                    ]
                   )

# End time
end_time = time.time()

# Calculate and print the time taken
elapsed_time = end_time - start_time
print(f"Time taken to train Model: {elapsed_time:.2f} seconds")
results["Experiment6"]["Train Time"] = str(round(elapsed_time, 2)) + ' seconds'
Epoch 1/200
704/704 [==============================] - 4s 4ms/step - loss: 1.8833 - accuracy: 0.3563 - val_loss: 1.9355 - val_accuracy: 0.3538
Epoch 2/200
704/704 [==============================] - 2s 3ms/step - loss: 1.6770 - accuracy: 0.4168 - val_loss: 2.0260 - val_accuracy: 0.3426
Epoch 3/200
704/704 [==============================] - 2s 3ms/step - loss: 1.5890 - accuracy: 0.4464 - val_loss: 2.0122 - val_accuracy: 0.3434
Epoch 4/200
704/704 [==============================] - 2s 3ms/step - loss: 1.5105 - accuracy: 0.4751 - val_loss: 2.0802 - val_accuracy: 0.3500
Epoch 5/200
704/704 [==============================] - 2s 3ms/step - loss: 1.4396 - accuracy: 0.4984 - val_loss: 1.7454 - val_accuracy: 0.3938
Epoch 6/200
704/704 [==============================] - 2s 3ms/step - loss: 1.3771 - accuracy: 0.5208 - val_loss: 1.5798 - val_accuracy: 0.4652
Epoch 7/200
704/704 [==============================] - 2s 3ms/step - loss: 1.3153 - accuracy: 0.5418 - val_loss: 1.6387 - val_accuracy: 0.4338
Epoch 8/200
704/704 [==============================] - 2s 3ms/step - loss: 1.2693 - accuracy: 0.5574 - val_loss: 1.5263 - val_accuracy: 0.4718
Epoch 9/200
704/704 [==============================] - 2s 3ms/step - loss: 1.2122 - accuracy: 0.5766 - val_loss: 1.6066 - val_accuracy: 0.4688
Epoch 10/200
704/704 [==============================] - 2s 3ms/step - loss: 1.1561 - accuracy: 0.5907 - val_loss: 1.5726 - val_accuracy: 0.4594
Epoch 11/200
704/704 [==============================] - 2s 3ms/step - loss: 1.1064 - accuracy: 0.6077 - val_loss: 1.6209 - val_accuracy: 0.4794
Epoch 12/200
704/704 [==============================] - 2s 3ms/step - loss: 1.0558 - accuracy: 0.6227 - val_loss: 1.6910 - val_accuracy: 0.4632
Epoch 13/200
704/704 [==============================] - 2s 3ms/step - loss: 1.0112 - accuracy: 0.6379 - val_loss: 1.8966 - val_accuracy: 0.4436
Epoch 14/200
704/704 [==============================] - 2s 3ms/step - loss: 0.9557 - accuracy: 0.6568 - val_loss: 1.6506 - val_accuracy: 0.4906
Epoch 15/200
704/704 [==============================] - 2s 3ms/step - loss: 0.9105 - accuracy: 0.6745 - val_loss: 1.8542 - val_accuracy: 0.4586
Epoch 16/200
704/704 [==============================] - 2s 3ms/step - loss: 0.8585 - accuracy: 0.6913 - val_loss: 1.8315 - val_accuracy: 0.4930
Epoch 17/200
704/704 [==============================] - 2s 3ms/step - loss: 0.8103 - accuracy: 0.7072 - val_loss: 1.8081 - val_accuracy: 0.4740
Epoch 18/200
704/704 [==============================] - 2s 3ms/step - loss: 0.7768 - accuracy: 0.7217 - val_loss: 1.9738 - val_accuracy: 0.4674
Epoch 19/200
704/704 [==============================] - 2s 3ms/step - loss: 0.7420 - accuracy: 0.7332 - val_loss: 37.4505 - val_accuracy: 0.4478
Epoch 20/200
704/704 [==============================] - 2s 3ms/step - loss: 0.6955 - accuracy: 0.7502 - val_loss: 4.8527 - val_accuracy: 0.4738
Epoch 21/200
704/704 [==============================] - 2s 3ms/step - loss: 0.6527 - accuracy: 0.7657 - val_loss: 2.3160 - val_accuracy: 0.4616
Epoch 22/200
704/704 [==============================] - 2s 3ms/step - loss: 0.6247 - accuracy: 0.7728 - val_loss: 2.4464 - val_accuracy: 0.4706
Epoch 23/200
704/704 [==============================] - 2s 3ms/step - loss: 0.5808 - accuracy: 0.7917 - val_loss: 2.4125 - val_accuracy: 0.4782
Time taken to train Model: 56.52 seconds
In [106]:
train_loss = history_06.history['loss'][-1]  # Training loss from the last epoch
train_accuracy = history_06.history['accuracy'][-1]  # Training accuracy from the last epoch
val_loss = history_06.history['val_loss'][-1]  # Validation loss from the last epoch
val_accuracy = history_06.history['val_accuracy'][-1]  # Validation accuracy from the last epoch

# Print training and validation metrics
print(f"Training Loss: {train_loss:.3f}, Training Accuracy: {train_accuracy:.3f}")
print(f"Validation Loss: {val_loss:.3f}, Validation Accuracy: {val_accuracy:.3f}")

model_06 = tf.keras.models.load_model("A2_Exp_06_3DNN_BN.h5")
test_loss, test_accuracy = model_06.evaluate(x_test_norm, y_test, verbose=0)
print(f"Test Loss: {test_loss:.3f}, Test Accuracy: {test_accuracy:.3f}")

results["Experiment6"]["Test Accuracy"] = round(test_accuracy,3)
results["Experiment6"]["Test Loss"] = round(test_loss,3)
results["Experiment6"]["Train Accuracy"] = round(train_accuracy,3)
results["Experiment6"]["Train Loss"] = round(train_loss,3)
results["Experiment6"]["Validation Accuracy"] = round(val_accuracy,3)
results["Experiment6"]["Validation Loss"] = round(val_loss,3)
Training Loss: 0.581, Training Accuracy: 0.792
Validation Loss: 2.413, Validation Accuracy: 0.478
Test Loss: 1.480, Test Accuracy: 0.483
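
The training and validation figures printed above come from the last epoch, while the test figures come from the reloaded checkpoint. With `save_best_only=True` and no `monitor` argument, `ModelCheckpoint` tracks `val_loss`, so the saved model is the one from the epoch with the lowest validation loss. The sketch below looks up the metrics at that epoch using values copied from the Experiment 6 log. The function name is hypothetical.

```python
def metrics_at_best_epoch(history, monitor="val_loss", lower_is_better=True):
    """Return (1-based epoch, metrics dict) for the epoch with the best monitored value."""
    values = history[monitor]
    pick = min if lower_is_better else max
    best_index = pick(range(len(values)), key=lambda i: values[i])
    metrics = {name: series[best_index] for name, series in history.items()}
    return best_index + 1, metrics


# Per-epoch values copied from the Experiment 6 training log (23 epochs).
history_06_log = {
    "accuracy": [0.3563, 0.4168, 0.4464, 0.4751, 0.4984, 0.5208, 0.5418, 0.5574,
                 0.5766, 0.5907, 0.6077, 0.6227, 0.6379, 0.6568, 0.6745, 0.6913,
                 0.7072, 0.7217, 0.7332, 0.7502, 0.7657, 0.7728, 0.7917],
    "val_loss": [1.9355, 2.0260, 2.0122, 2.0802, 1.7454, 1.5798, 1.6387, 1.5263,
                 1.6066, 1.5726, 1.6209, 1.6910, 1.8966, 1.6506, 1.8542, 1.8315,
                 1.8081, 1.9738, 37.4505, 4.8527, 2.3160, 2.4464, 2.4125],
    "val_accuracy": [0.3538, 0.3426, 0.3434, 0.3500, 0.3938, 0.4652, 0.4338, 0.4718,
                     0.4688, 0.4594, 0.4794, 0.4632, 0.4436, 0.4906, 0.4586, 0.4930,
                     0.4740, 0.4674, 0.4478, 0.4738, 0.4616, 0.4706, 0.4782],
}

best_epoch, best_metrics = metrics_at_best_epoch(history_06_log)
print(best_epoch, best_metrics)
```

On these numbers the lowest validation loss is at epoch 8, where training accuracy was 0.557 rather than the 0.792 of the final epoch. For a like-for-like comparison across experiments, it may be fairer to record the training and validation metrics of that epoch alongside the test metrics.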
In [107]:
pred06 = model_06.predict(x_test_norm)
print('shape of preds: ', pred06.shape)
313/313 [==============================] - 0s 960us/step
shape of preds:  (10000, 10)
In [108]:
history_06_dict = history_06.history
history_06_df=pd.DataFrame(history_06_dict)
history_06_df.tail().round(3)
Out[108]:
     loss  accuracy  val_loss  val_accuracy
18  0.742     0.733    37.450         0.448
19  0.695     0.750     4.853         0.474
20  0.653     0.766     2.316         0.462
21  0.625     0.773     2.446         0.471
22  0.581     0.792     2.413         0.478

Plotting Performance Metrics¶

We use Matplotlib to create two plots that show training and validation accuracy, and training and validation loss, for each training epoch.

In [109]:
plt.subplots(figsize=(16,12))
plt.tight_layout()
display_training_curves(history_06.history['accuracy'], history_06.history['val_accuracy'], 'accuracy', 211)
display_training_curves(history_06.history['loss'], history_06.history['val_loss'], 'loss', 212)
[Figure: training and validation accuracy and loss curves for Experiment 6]

Confusion matrices¶

Using sklearn.metrics, we print a classification report for the test set. Then we visualize the confusion matrix and see what that tells us.

In [110]:
pred06_cm=np.argmax(pred06, axis=1)
print_validation_report(y_test, pred06_cm)
Classification Report
              precision    recall  f1-score   support

           0       0.57      0.45      0.50      1000
           1       0.58      0.63      0.61      1000
           2       0.32      0.51      0.40      1000
           3       0.33      0.36      0.35      1000
           4       0.51      0.25      0.33      1000
           5       0.42      0.37      0.40      1000
           6       0.61      0.43      0.50      1000
           7       0.51      0.62      0.56      1000
           8       0.66      0.57      0.62      1000
           9       0.49      0.63      0.55      1000

    accuracy                           0.48     10000
   macro avg       0.50      0.48      0.48     10000
weighted avg       0.50      0.48      0.48     10000

Accuracy Score: 0.4834
Root Mean Square Error: 3.1218103722039237
In [111]:
plot_confusion_matrix(y_test,pred06_cm)
[Figure: confusion matrix for Experiment 6 on the test set]
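As a minimal sketch of what the classification report summarizes, the snippet below builds a confusion matrix from integer labels and derives per-class precision and recall from it. It uses plain Python and a tiny made-up label set (three classes, eight samples). The helper names `confusion_matrix_counts` and `precision_recall` are illustrative only; they are not part of sklearn or of the notebook's `print_validation_report`.

```python
def confusion_matrix_counts(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def precision_recall(matrix, cls):
    """Precision uses the column sum, recall uses the row sum."""
    true_positives = matrix[cls][cls]
    predicted_as_cls = sum(row[cls] for row in matrix)
    actually_cls = sum(matrix[cls])
    precision = true_positives / predicted_as_cls if predicted_as_cls else 0.0
    recall = true_positives / actually_cls if actually_cls else 0.0
    return precision, recall


# Toy labels (made up for illustration, not CIFAR-10 results)
toy_true = [0, 0, 0, 1, 1, 2, 2, 2]
toy_pred = [0, 0, 1, 1, 2, 2, 2, 0]

toy_cm = confusion_matrix_counts(toy_true, toy_pred, n_classes=3)
for row in toy_cm:
    print(row)
print(precision_recall(toy_cm, cls=1))
```

Read this way, the Experiment 6 report shows that class 4 (deer) has precision 0.51 but recall 0.25, so only a quarter of the deer images are found, while class 2 (bird) has recall 0.51 but precision 0.32, so it is predicted far more often than it is correct.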
In [112]:
cm = sns.light_palette((260, 75, 60), input="husl", as_cmap=True)
df = pd.DataFrame(pred06[0:20], columns = ['airplane'
                                          ,'automobile'
                                          ,'bird'
                                          ,'cat'
                                          ,'deer'
                                          ,'dog'
                                          ,'frog'
                                          ,'horse'
                                          ,'ship'
                                          ,'truck'])
df.style.format("{:.2%}").background_gradient(cmap=cm)
Out[112]:
  airplane automobile bird cat deer dog frog horse ship truck
0 7.66% 2.98% 5.13% 50.61% 11.89% 15.55% 0.75% 2.57% 1.59% 1.27%
1 4.73% 25.66% 1.40% 0.43% 0.18% 0.07% 0.08% 0.87% 32.65% 33.93%
2 26.09% 24.59% 1.21% 0.39% 0.34% 0.39% 0.05% 0.27% 34.50% 12.17%
3 10.73% 2.26% 26.29% 3.68% 7.94% 1.76% 0.10% 20.67% 21.73% 4.82%
4 0.36% 0.01% 14.09% 1.07% 51.07% 12.86% 19.84% 0.63% 0.05% 0.01%
5 1.45% 0.14% 5.52% 16.95% 2.42% 6.65% 61.91% 4.68% 0.10% 0.18%
6 1.71% 33.93% 2.43% 45.51% 0.02% 3.89% 0.41% 3.36% 1.74% 7.00%
7 2.94% 0.38% 9.10% 1.52% 2.10% 5.03% 77.75% 0.14% 0.03% 0.99%
8 0.86% 0.27% 47.46% 10.93% 10.09% 14.66% 0.27% 14.41% 0.64% 0.41%
9 1.45% 44.98% 3.11% 1.51% 0.24% 0.13% 0.32% 0.14% 2.90% 45.23%
10 14.39% 1.11% 1.99% 7.04% 1.01% 9.98% 1.56% 0.22% 62.44% 0.25%
11 0.41% 35.34% 0.32% 0.06% 0.05% 0.03% 0.08% 0.12% 0.83% 62.75%
12 1.08% 17.70% 7.69% 20.38% 0.68% 21.92% 20.05% 2.99% 3.30% 4.20%
13 0.07% 0.00% 0.03% 0.00% 0.00% 0.01% 0.00% 99.88% 0.00% 0.00%
14 2.22% 45.84% 9.35% 3.64% 0.08% 3.49% 5.13% 1.09% 0.42% 28.75%
15 3.57% 0.42% 2.37% 11.64% 1.63% 25.82% 4.18% 0.47% 49.52% 0.40%
16 8.17% 10.47% 3.62% 18.23% 1.53% 20.88% 1.96% 24.62% 3.22% 7.30%
17 2.42% 0.74% 6.93% 13.67% 7.07% 7.33% 1.31% 40.73% 0.91% 18.89%
18 6.50% 4.99% 1.18% 0.56% 1.84% 0.01% 0.29% 0.15% 84.21% 0.27%
19 1.53% 0.01% 2.63% 0.35% 4.74% 3.61% 5.87% 81.13% 0.00% 0.12%
In [113]:
layer_names = []
for layer in model_06.layers:
    layer_names.append(layer.name)

layer_names
Out[113]:
['flatten',
 'dense',
 'batch_normalization',
 'dense_1',
 'batch_normalization_1',
 'dense_2',
 'batch_normalization_2',
 'dense_3']
In [114]:
# Collect the outputs of all 8 layers of the model:
layer_outputs = [layer.output for layer in model_06.layers[:8]]
# Create a model that returns these outputs for a given model input:
activation_model_06 = tf.keras.models.Model(inputs=model_06.input, outputs=layer_outputs)

# Get activation values; index -3 is the last hidden dense layer ('dense_2'), index -1 is the softmax output
# activations_06 = activation_model_06.predict(x_valid_norm[:3250])
activations_06 = activation_model_06.predict(x_valid_norm[:1500])
dense_layer_activations_06 = activations_06[-3]
output_layer_activations_06 = activations_06[-1]
47/47 [==============================] - 0s 1ms/step
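Because `layer_outputs` follows the order of `model_06.layers`, the negative indices used on `activations_06` can be checked against the layer names printed in Out[113]. A small sketch, with the names copied from that output (the variable `model_06_layer_names` is introduced here only for illustration):

```python
# Layer names copied from Out[113]
model_06_layer_names = [
    'flatten', 'dense', 'batch_normalization', 'dense_1',
    'batch_normalization_1', 'dense_2', 'batch_normalization_2', 'dense_3',
]

# Map each negative index to the layer whose activations it selects
picked = {offset: model_06_layer_names[offset] for offset in (-3, -2, -1)}
print(picked)
# {-3: 'dense_2', -2: 'batch_normalization_2', -1: 'dense_3'}
```

So the features passed to t-SNE are the outputs of `dense_2`, taken before the final batch normalization layer.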

sklearn.manifold.TSNE¶

https://scikit-learn.org/stable/modules/generated/sklearn.manifold.TSNE.html

In [115]:
# Reduce the dimension using t-SNE to visualize in a scatterplot
tsne_06 = TSNE(n_components=2, verbose=1, init='pca', learning_rate='auto', perplexity=40, n_iter=300)
tsne_results_06 = tsne_06.fit_transform(dense_layer_activations_06)

# Scaling
tsne_results_06 = (tsne_results_06 - tsne_results_06.min()) / (tsne_results_06.max() - tsne_results_06.min())
[t-SNE] Computing 121 nearest neighbors...
[t-SNE] Indexed 1500 samples in 0.001s...
[t-SNE] Computed neighbors for 1500 samples in 0.071s...
[t-SNE] Computed conditional probabilities for sample 1000 / 1500
[t-SNE] Computed conditional probabilities for sample 1500 / 1500
[t-SNE] Mean sigma: 15.065521
[t-SNE] KL divergence after 250 iterations with early exaggeration: 66.312225
[t-SNE] KL divergence after 300 iterations: 1.695715
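Two numbers in the log can be reproduced with simple arithmetic. The sketch assumes that scikit-learn searches for roughly 3 × perplexity neighbours and runs a fixed 250-iteration early-exaggeration phase. Both assumptions are consistent with the log ("121 nearest neighbors", "after 250 iterations with early exaggeration"), but they describe library internals and may differ between versions.

```python
perplexity = 40
n_samples = 1500
n_iter = 300

# Assumption: length of the early-exaggeration phase, as reported in the log
EARLY_EXAGGERATION_ITERS = 250

# Assumption: neighbour count is about three times the perplexity, capped by the sample count
n_neighbors = min(n_samples - 1, int(3.0 * perplexity + 1))

# Iterations left for the normal optimization phase
remaining_iters = n_iter - EARLY_EXAGGERATION_ITERS

print(n_neighbors, remaining_iters)  # 121 50
```

Under these assumptions, `n_iter=300` leaves only 50 iterations after early exaggeration ends, so a larger value may be worth trying if the clusters look smeared.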
In [116]:
cmap = plt.cm.tab10
plt.figure(figsize=(16,10))
# scatter = plt.scatter(tsne_results_06[:,0],tsne_results_06[:,1], c=y_valid_split[:3250], s=10, cmap=cmap)
scatter = plt.scatter(tsne_results_06[:,0],tsne_results_06[:,1], c=y_valid_split[:1500], s=10, cmap=cmap)
plt.legend(handles=scatter.legend_elements()[0], labels=class_names)

image_positions = np.array([[1., 1.]])
for index, position in enumerate(tsne_results_06):
    dist = np.sum((position - image_positions) ** 2, axis=1)
    if np.min(dist) > 0.02: # if far enough from other images
        image_positions = np.r_[image_positions, [position]]
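        # NOTE: the plotted points are embeddings of x_valid_norm[:1500], but the thumbnail is read
        # from x_train[index]; it matches the point only if both arrays hold the same sample at this index.
        imagebox = mpl.offsetbox.AnnotationBbox(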
            mpl.offsetbox.OffsetImage(x_train[index], cmap="binary"),
            position, bboxprops={"lw": 1})
        plt.gca().add_artist(imagebox)
plt.axis("off")
plt.show()
[Figure: t-SNE scatter plot of the dense-layer activations for Experiment 6, colored by class, with image thumbnails]
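The thumbnail loop in the cell above is a greedy spacing filter: a point receives a thumbnail only if it is far enough from every position that already has one. The sketch below isolates that logic in plain Python on made-up coordinates. The function name `pick_spaced_points` is hypothetical, and the seed position mirrors `image_positions = np.array([[1., 1.]])`.

```python
def pick_spaced_points(points, min_sq_dist=0.02, seed=(1.0, 1.0)):
    """Greedily keep indices of points whose squared distance to every kept position exceeds min_sq_dist."""
    kept_positions = [seed]
    kept_indices = []
    for index, (x, y) in enumerate(points):
        nearest = min((x - kx) ** 2 + (y - ky) ** 2 for kx, ky in kept_positions)
        if nearest > min_sq_dist:  # far enough from all kept positions
            kept_positions.append((x, y))
            kept_indices.append(index)
    return kept_indices


# Made-up points in the unit square, for illustration only
toy_points = [(0.10, 0.10), (0.12, 0.11), (0.50, 0.50), (0.95, 0.95), (0.30, 0.10)]
print(pick_spaced_points(toy_points))  # [0, 2, 4]
```

Note that the threshold is compared with a squared distance, so 0.02 corresponds to a minimum spacing of about 0.14 in the scaled coordinates. The seed at (1, 1) also keeps thumbnails out of that corner of the plot.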

Experiment 7¶

  • CNN with 2 convolution/max pooling layers
  • Batch Normalization after the dense layer (the L2(0.001) regularizer is commented out in the model code)
Build CNN Model¶
In [117]:
k.clear_session()
model_07 = Sequential([
  Conv2D(filters=128, kernel_size=(3, 3), strides=(1, 1), activation=tf.nn.relu,input_shape=x_train_norm.shape[1:]),
  MaxPool2D((2, 2),strides=2),
#  Dropout(0.3),
  Conv2D(filters=256, kernel_size=(3, 3), strides=(1, 1), activation=tf.nn.relu),
  MaxPool2D((2, 2),strides=2),
#  Dropout(0.3),
  Flatten(),
  Dense(units=384,activation=tf.nn.relu),
#  Dense(units=384,activation=tf.nn.relu,kernel_regularizer=tf.keras.regularizers.L2(0.001)),
  BatchNormalization(),
#  Dropout(0.3),
  Dense(units=10, activation=tf.nn.softmax)
])
results["Experiment7"] = {}
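# NOTE: the label below still mentions L2 regularization, but the L2-regularized Dense layer above
# is commented out; the model as built uses BatchNormalization instead.
results["Experiment7"]["Architecture"] = "• CNN with 2 layers/max pooling layers\n • L2 Regularization(0.001)"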
In [118]:
model_07.summary()
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d (Conv2D)             (None, 30, 30, 128)       3584      
                                                                 
 max_pooling2d (MaxPooling2  (None, 15, 15, 128)       0         
 D)                                                              
                                                                 
 conv2d_1 (Conv2D)           (None, 13, 13, 256)       295168    
                                                                 
 max_pooling2d_1 (MaxPoolin  (None, 6, 6, 256)         0         
 g2D)                                                            
                                                                 
 flatten (Flatten)           (None, 9216)              0         
                                                                 
 dense (Dense)               (None, 384)               3539328   
                                                                 
 batch_normalization (Batch  (None, 384)               1536      
 Normalization)                                                  
                                                                 
 dense_1 (Dense)             (None, 10)                3850      
                                                                 
=================================================================
Total params: 3843466 (14.66 MB)
Trainable params: 3842698 (14.66 MB)
Non-trainable params: 768 (3.00 KB)
_________________________________________________________________
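The output shapes and parameter counts in this summary can be checked by hand. The sketch below assumes 32×32×3 CIFAR-10 inputs, 3×3 convolutions with stride 1 and no padding, and 2×2 max pooling with stride 2, as in the model definition. The helper names are illustrative.

```python
def conv_out(size, kernel=3, stride=1):
    """Output side length of a convolution without padding."""
    return (size - kernel) // stride + 1


def pool_out(size, pool=2, stride=2):
    """Output side length of max pooling without padding."""
    return (size - pool) // stride + 1


def conv_params(kernel, in_channels, filters):
    """Weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters


def dense_params(inputs, units):
    """Weights plus one bias per unit."""
    return inputs * units + units


side = 32                          # assumed CIFAR-10 image side
side = pool_out(conv_out(side))    # 30 after conv, 15 after pooling
side = pool_out(conv_out(side))    # 13 after conv, 6 after pooling
flat = side * side * 256           # flattened feature vector

counts = {
    "conv2d": conv_params(3, 3, 128),
    "conv2d_1": conv_params(3, 128, 256),
    "dense": dense_params(flat, 384),
    "batch_normalization": 4 * 384,   # gamma, beta, moving mean, moving variance
    "dense_1": dense_params(384, 10),
}
total = sum(counts.values())
print(flat, counts, total)
```

The first dense layer accounts for 3,539,328 of the 3,843,466 parameters, because it is connected to all 9,216 flattened features. Half of the batch normalization parameters (the moving mean and variance, 768 values) are the non-trainable ones reported in the summary.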
In [119]:
keras.utils.plot_model(model_07, "CIFAR10_EXP_07.png", show_shapes=True)
You must install pydot (`pip install pydot`) and install graphviz (see instructions at https://graphviz.gitlab.io/download/) for plot_model to work.
In [120]:
model_07.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
              metrics=['accuracy'])

Model Train¶

In [121]:
# Start time
start_time = time.time()

history_07 = model_07.fit(x_train_norm
                    ,y_train_split
                    ,epochs=200
                    ,batch_size=64
                    ,verbose=1
                    ,validation_data=(x_valid_norm, y_valid_split)
                    ,callbacks=[
                     tf.keras.callbacks.ModelCheckpoint("A2_Exp_07_2CNN_BN.h5",save_best_only=True,save_weights_only=False)
                     ,tf.keras.callbacks.EarlyStopping(monitor='val_accuracy', patience=10),
                    ]
                   )

# End time
end_time = time.time()

# Calculate and print the time taken
elapsed_time = end_time - start_time
print(f"Time taken to train Model: {elapsed_time:.2f} seconds")
results["Experiment7"]["Train Time"] = str(round(elapsed_time, 2)) + ' seconds'
Epoch 1/200
704/704 [==============================] - 4s 5ms/step - loss: 1.2678 - accuracy: 0.5580 - val_loss: 1.4533 - val_accuracy: 0.5008
Epoch 2/200
704/704 [==============================] - 3s 5ms/step - loss: 0.9628 - accuracy: 0.6645 - val_loss: 2.1573 - val_accuracy: 0.4074
Epoch 3/200
704/704 [==============================] - 3s 5ms/step - loss: 0.8284 - accuracy: 0.7144 - val_loss: 1.2960 - val_accuracy: 0.5830
Epoch 4/200
704/704 [==============================] - 3s 5ms/step - loss: 0.7231 - accuracy: 0.7514 - val_loss: 1.0535 - val_accuracy: 0.6364
Epoch 5/200
704/704 [==============================] - 3s 5ms/step - loss: 0.6363 - accuracy: 0.7792 - val_loss: 0.9151 - val_accuracy: 0.6892
Epoch 6/200
704/704 [==============================] - 3s 5ms/step - loss: 0.5506 - accuracy: 0.8075 - val_loss: 0.9850 - val_accuracy: 0.6806
Epoch 7/200
704/704 [==============================] - 3s 4ms/step - loss: 0.4700 - accuracy: 0.8357 - val_loss: 0.9247 - val_accuracy: 0.7110
Epoch 8/200
704/704 [==============================] - 3s 5ms/step - loss: 0.3857 - accuracy: 0.8669 - val_loss: 1.0033 - val_accuracy: 0.6950
Epoch 9/200
704/704 [==============================] - 3s 5ms/step - loss: 0.3066 - accuracy: 0.8937 - val_loss: 1.0783 - val_accuracy: 0.7106
Epoch 10/200
704/704 [==============================] - 3s 4ms/step - loss: 0.2392 - accuracy: 0.9178 - val_loss: 1.1189 - val_accuracy: 0.7036
Epoch 11/200
704/704 [==============================] - 3s 4ms/step - loss: 0.1933 - accuracy: 0.9340 - val_loss: 1.2544 - val_accuracy: 0.6988
Epoch 12/200
704/704 [==============================] - 3s 4ms/step - loss: 0.1544 - accuracy: 0.9478 - val_loss: 1.5356 - val_accuracy: 0.6682
Epoch 13/200
704/704 [==============================] - 3s 5ms/step - loss: 0.1321 - accuracy: 0.9552 - val_loss: 1.3954 - val_accuracy: 0.7046
Epoch 14/200
704/704 [==============================] - 3s 4ms/step - loss: 0.1036 - accuracy: 0.9654 - val_loss: 1.5925 - val_accuracy: 0.6998
Epoch 15/200
704/704 [==============================] - 3s 4ms/step - loss: 0.1136 - accuracy: 0.9605 - val_loss: 1.5677 - val_accuracy: 0.6912
Epoch 16/200
704/704 [==============================] - 3s 4ms/step - loss: 0.0983 - accuracy: 0.9660 - val_loss: 1.4800 - val_accuracy: 0.7128
Epoch 17/200
704/704 [==============================] - 3s 4ms/step - loss: 0.0740 - accuracy: 0.9751 - val_loss: 1.4750 - val_accuracy: 0.7092
Epoch 18/200
704/704 [==============================] - 3s 5ms/step - loss: 0.0783 - accuracy: 0.9730 - val_loss: 1.5853 - val_accuracy: 0.7024
Epoch 19/200
704/704 [==============================] - 3s 4ms/step - loss: 0.0621 - accuracy: 0.9796 - val_loss: 1.6739 - val_accuracy: 0.7036
Epoch 20/200
704/704 [==============================] - 3s 5ms/step - loss: 0.0668 - accuracy: 0.9774 - val_loss: 1.7103 - val_accuracy: 0.7064
Epoch 21/200
704/704 [==============================] - 3s 4ms/step - loss: 0.0628 - accuracy: 0.9784 - val_loss: 1.5978 - val_accuracy: 0.7134
Epoch 22/200
704/704 [==============================] - 3s 4ms/step - loss: 0.0555 - accuracy: 0.9816 - val_loss: 1.7585 - val_accuracy: 0.6932
Epoch 23/200
704/704 [==============================] - 3s 4ms/step - loss: 0.0600 - accuracy: 0.9794 - val_loss: 1.7080 - val_accuracy: 0.7148
Epoch 24/200
704/704 [==============================] - 3s 5ms/step - loss: 0.0451 - accuracy: 0.9847 - val_loss: 1.7430 - val_accuracy: 0.6884
Epoch 25/200
704/704 [==============================] - 3s 4ms/step - loss: 0.0916 - accuracy: 0.9700 - val_loss: 1.6268 - val_accuracy: 0.7106
Epoch 26/200
704/704 [==============================] - 3s 4ms/step - loss: 0.0352 - accuracy: 0.9878 - val_loss: 1.8817 - val_accuracy: 0.7006
Epoch 27/200
704/704 [==============================] - 3s 4ms/step - loss: 0.0521 - accuracy: 0.9823 - val_loss: 1.6954 - val_accuracy: 0.7194
Epoch 28/200
704/704 [==============================] - 3s 4ms/step - loss: 0.0383 - accuracy: 0.9867 - val_loss: 2.0266 - val_accuracy: 0.6938
Epoch 29/200
704/704 [==============================] - 3s 4ms/step - loss: 0.0397 - accuracy: 0.9864 - val_loss: 1.8967 - val_accuracy: 0.7092
Epoch 30/200
704/704 [==============================] - 3s 4ms/step - loss: 0.0616 - accuracy: 0.9798 - val_loss: 1.6888 - val_accuracy: 0.7164
Epoch 31/200
704/704 [==============================] - 3s 5ms/step - loss: 0.0246 - accuracy: 0.9916 - val_loss: 1.7348 - val_accuracy: 0.7182
Epoch 32/200
704/704 [==============================] - 3s 5ms/step - loss: 0.0622 - accuracy: 0.9799 - val_loss: 1.8087 - val_accuracy: 0.7250
Epoch 33/200
704/704 [==============================] - 3s 5ms/step - loss: 0.0197 - accuracy: 0.9932 - val_loss: 2.0883 - val_accuracy: 0.6954
Epoch 34/200
704/704 [==============================] - 3s 5ms/step - loss: 0.0952 - accuracy: 0.9702 - val_loss: 1.6310 - val_accuracy: 0.7270
Epoch 35/200
704/704 [==============================] - 3s 5ms/step - loss: 0.0188 - accuracy: 0.9941 - val_loss: 1.6175 - val_accuracy: 0.7330
Epoch 36/200
704/704 [==============================] - 3s 4ms/step - loss: 0.0202 - accuracy: 0.9938 - val_loss: 1.8467 - val_accuracy: 0.7132
Epoch 37/200
704/704 [==============================] - 3s 5ms/step - loss: 0.0342 - accuracy: 0.9883 - val_loss: 1.8946 - val_accuracy: 0.7202
Epoch 38/200
704/704 [==============================] - 3s 5ms/step - loss: 0.0282 - accuracy: 0.9905 - val_loss: 1.8807 - val_accuracy: 0.7212
Epoch 39/200
704/704 [==============================] - 3s 5ms/step - loss: 0.0392 - accuracy: 0.9868 - val_loss: 1.9510 - val_accuracy: 0.6984
Epoch 40/200
704/704 [==============================] - 3s 5ms/step - loss: 0.0466 - accuracy: 0.9847 - val_loss: 1.8118 - val_accuracy: 0.7240
Epoch 41/200
704/704 [==============================] - 3s 5ms/step - loss: 0.0256 - accuracy: 0.9918 - val_loss: 1.7694 - val_accuracy: 0.7218
Epoch 42/200
704/704 [==============================] - 3s 5ms/step - loss: 0.0151 - accuracy: 0.9952 - val_loss: 2.0283 - val_accuracy: 0.6974
Epoch 43/200
704/704 [==============================] - 3s 5ms/step - loss: 0.0322 - accuracy: 0.9891 - val_loss: 2.1238 - val_accuracy: 0.7052
Epoch 44/200
704/704 [==============================] - 3s 5ms/step - loss: 0.0379 - accuracy: 0.9876 - val_loss: 1.7790 - val_accuracy: 0.7208
Epoch 45/200
704/704 [==============================] - 3s 5ms/step - loss: 0.0219 - accuracy: 0.9925 - val_loss: 1.8939 - val_accuracy: 0.7238
Time taken to train Model: 145.38 seconds
In [122]:
train_loss = history_07.history['loss'][-1]  # Training loss from the last epoch
train_accuracy = history_07.history['accuracy'][-1]  # Training accuracy from the last epoch
val_loss = history_07.history['val_loss'][-1]  # Validation loss from the last epoch
val_accuracy = history_07.history['val_accuracy'][-1]  # Validation accuracy from the last epoch

# Print training and validation metrics
print(f"Training Loss: {train_loss:.3f}, Training Accuracy: {train_accuracy:.3f}")
print(f"Validation Loss: {val_loss:.3f}, Validation Accuracy: {val_accuracy:.3f}")

model_07 = tf.keras.models.load_model("A2_Exp_07_2CNN_BN.h5")
test_loss, test_accuracy = model_07.evaluate(x_test_norm, y_test, verbose=0)
print(f"Test Loss: {test_loss:.3f}, Test Accuracy: {test_accuracy:.3f}")

results["Experiment7"]["Test Accuracy"] = round(test_accuracy,3)
results["Experiment7"]["Test Loss"] = round(test_loss,3)
results["Experiment7"]["Train Accuracy"] = round(train_accuracy,3)
results["Experiment7"]["Train Loss"] = round(train_loss,3)
results["Experiment7"]["Validation Accuracy"] = round(val_accuracy,3)
results["Experiment7"]["Validation Loss"] = round(val_loss,3)
Training Loss: 0.022, Training Accuracy: 0.993
Validation Loss: 1.894, Validation Accuracy: 0.724
Test Loss: 0.907, Test Accuracy: 0.701
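The training and validation figures printed above come from the last epoch, whereas the test figures come from the model reloaded from `A2_Exp_07_2CNN_BN.h5`. `ModelCheckpoint` is created here without a `monitor` argument, so it falls back to the Keras default of `val_loss`, and the file holds the epoch with the lowest validation loss rather than the last one. The sketch below finds that epoch in a history dictionary. `best_epoch_metrics` is a hypothetical helper, and the values are the first seven epochs copied from the training log (all later epochs have a higher validation loss).

```python
def best_epoch_metrics(history, monitor="val_loss", mode="min"):
    """Return the 1-based epoch with the best monitored value and all metrics at that epoch."""
    values = history[monitor]
    pick = min if mode == "min" else max
    best_index = pick(range(len(values)), key=values.__getitem__)
    metrics = {name: series[best_index] for name, series in history.items()}
    return best_index + 1, metrics


# First seven epochs of Experiment 7, copied from the log
history_07_head = {
    "loss": [1.2678, 0.9628, 0.8284, 0.7231, 0.6363, 0.5506, 0.4700],
    "accuracy": [0.5580, 0.6645, 0.7144, 0.7514, 0.7792, 0.8075, 0.8357],
    "val_loss": [1.4533, 2.1573, 1.2960, 1.0535, 0.9151, 0.9850, 0.9247],
    "val_accuracy": [0.5008, 0.4074, 0.5830, 0.6364, 0.6892, 0.6806, 0.7110],
}

best_epoch, best_metrics = best_epoch_metrics(history_07_head)
print(best_epoch, best_metrics)
```

The checkpointed model is therefore the one from epoch 5 (validation loss 0.9151, validation accuracy 0.6892), which is consistent with the test loss of 0.907 and test accuracy of 0.701. Comparing the test metrics with the last-epoch validation loss of 1.894 mixes two different sets of weights.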
In [123]:
pred07 = model_07.predict(x_test_norm)
print('shape of preds: ', pred07.shape)
313/313 [==============================] - 0s 1ms/step
shape of preds:  (10000, 10)
In [124]:
history_07_dict = history_07.history
history_07_df=pd.DataFrame(history_07_dict)
history_07_df.tail().round(3)
Out[124]:
loss accuracy val_loss val_accuracy
40 0.026 0.992 1.769 0.722
41 0.015 0.995 2.028 0.697
42 0.032 0.989 2.124 0.705
43 0.038 0.988 1.779 0.721
44 0.022 0.993 1.894 0.724

Plotting Performance Metrics¶

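We use Matplotlib to create two plots showing the training and validation accuracy and the training and validation loss at each training epoch. The subplot codes 211 and 212 passed below correspond to a layout of two rows and one column.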

In [125]:
plt.subplots(figsize=(16,12))
plt.tight_layout()
display_training_curves(history_07.history['accuracy'], history_07.history['val_accuracy'], 'accuracy', 211)
display_training_curves(history_07.history['loss'], history_07.history['val_loss'], 'loss', 212)
[Figure: training and validation accuracy and loss curves for Experiment 7]

Confusion matrices¶

Using sklearn.metrics, we print a classification report for the test set. Then we visualize the confusion matrix and see what that tells us.

In [126]:
pred07_cm=np.argmax(pred07, axis=1)
print_validation_report(y_test, pred07_cm)
Classification Report
              precision    recall  f1-score   support

           0       0.83      0.64      0.72      1000
           1       0.86      0.82      0.84      1000
           2       0.61      0.57      0.59      1000
           3       0.42      0.72      0.53      1000
           4       0.74      0.61      0.67      1000
           5       0.63      0.61      0.62      1000
           6       0.70      0.84      0.76      1000
           7       0.88      0.64      0.74      1000
           8       0.86      0.77      0.81      1000
           9       0.82      0.79      0.80      1000

    accuracy                           0.70     10000
   macro avg       0.73      0.70      0.71     10000
weighted avg       0.73      0.70      0.71     10000

Accuracy Score: 0.7008
Root Mean Square Error: 2.1608100332976985
In [127]:
plot_confusion_matrix(y_test,pred07_cm)
[Figure: confusion matrix for Experiment 7 on the test set]
In [128]:
cm = sns.light_palette((260, 75, 60), input="husl", as_cmap=True)
df = pd.DataFrame(pred07[0:20], columns = ['airplane'
                                          ,'automobile'
                                          ,'bird'
                                          ,'cat'
                                          ,'deer'
                                          ,'dog'
                                          ,'frog'
                                          ,'horse'
                                          ,'ship'
                                          ,'truck'])
df.style.format("{:.2%}").background_gradient(cmap=cm)
Out[128]:
  airplane automobile bird cat deer dog frog horse ship truck
0 0.00% 0.00% 0.04% 84.82% 0.02% 2.12% 12.58% 0.01% 0.41% 0.00%
1 6.45% 68.79% 0.06% 0.00% 0.00% 0.00% 0.00% 0.00% 24.37% 0.33%
2 6.13% 6.65% 2.07% 31.19% 0.86% 1.43% 0.72% 2.01% 48.12% 0.82%
3 74.27% 1.52% 1.43% 0.17% 0.13% 0.01% 0.72% 0.02% 21.48% 0.25%
4 0.00% 0.00% 3.06% 21.39% 3.61% 0.83% 71.11% 0.00% 0.00% 0.00%
5 0.01% 0.01% 0.47% 3.63% 0.50% 3.97% 91.15% 0.22% 0.01% 0.03%
6 0.32% 94.77% 0.11% 0.14% 0.00% 0.66% 0.50% 0.00% 0.01% 3.47%
7 0.10% 0.00% 2.16% 4.61% 18.78% 0.56% 73.66% 0.10% 0.02% 0.02%
8 0.05% 0.00% 0.23% 96.94% 0.13% 1.70% 0.23% 0.71% 0.00% 0.00%
9 0.28% 93.06% 0.58% 0.11% 0.00% 0.01% 0.02% 0.00% 0.20% 5.71%
10 5.74% 0.01% 3.90% 33.00% 51.30% 3.32% 0.03% 2.57% 0.10% 0.03%
11 0.00% 0.53% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 99.47%
12 0.03% 0.01% 44.11% 11.27% 3.79% 37.68% 1.62% 1.45% 0.02% 0.01%
13 0.03% 0.00% 0.78% 1.69% 3.83% 1.21% 0.02% 92.43% 0.00% 0.01%
14 0.00% 0.03% 0.00% 0.01% 0.00% 0.00% 0.00% 0.00% 0.01% 99.95%
15 0.04% 0.01% 0.26% 1.50% 0.12% 0.02% 66.20% 0.00% 31.85% 0.01%
16 0.02% 0.01% 0.23% 16.94% 0.01% 82.48% 0.03% 0.27% 0.00% 0.02%
17 0.43% 0.02% 6.93% 13.90% 0.79% 11.21% 5.65% 60.80% 0.02% 0.25%
18 0.34% 0.62% 0.00% 0.03% 0.00% 0.00% 0.01% 0.00% 98.12% 0.87%
19 0.00% 0.00% 0.13% 0.20% 0.46% 0.04% 99.17% 0.00% 0.00% 0.00%
In [129]:
layer_names = []
for layer in model_07.layers:
    layer_names.append(layer.name)

layer_names
Out[129]:
['conv2d',
 'max_pooling2d',
 'conv2d_1',
 'max_pooling2d_1',
 'flatten',
 'dense',
 'batch_normalization',
 'dense_1']

sklearn.manifold.TSNE¶

https://scikit-learn.org/stable/modules/generated/sklearn.manifold.TSNE.html

In [130]:
# Collect the outputs of all 8 layers of the model:
layer_outputs_07 = [layer.output for layer in model_07.layers[:8]]
# Create a model that returns these outputs for a given model input:
activation_model_07 = tf.keras.models.Model(inputs=model_07.input, outputs=layer_outputs_07)

# Get activation values; index -3 is the hidden dense layer ('dense'), index -1 is the softmax output
# activations_07 = activation_model_07.predict(x_valid_norm[:3250])
activations_07 = activation_model_07.predict(x_valid_norm[:1200])
dense_layer_activations_07 = activations_07[-3]
output_layer_activations_07 = activations_07[-1]
38/38 [==============================] - 0s 2ms/step
In [131]:
# Reduce the dimension using t-SNE to visualize in a scatterplot
tsne_07 = TSNE(n_components=2, verbose=1, init='pca', learning_rate='auto', perplexity=40, n_iter=300)
tsne_results_07 = tsne_07.fit_transform(dense_layer_activations_07)

# Scaling
tsne_results_07 = (tsne_results_07 - tsne_results_07.min()) / (tsne_results_07.max() - tsne_results_07.min())
[t-SNE] Computing 121 nearest neighbors...
[t-SNE] Indexed 1200 samples in 0.000s...
[t-SNE] Computed neighbors for 1200 samples in 0.022s...
[t-SNE] Computed conditional probabilities for sample 1000 / 1200
[t-SNE] Computed conditional probabilities for sample 1200 / 1200
[t-SNE] Mean sigma: 1.826008
[t-SNE] KL divergence after 250 iterations with early exaggeration: 65.998062
[t-SNE] KL divergence after 300 iterations: 1.695120
In [132]:
cmap = plt.cm.tab10
plt.figure(figsize=(16,10))
# scatter = plt.scatter(tsne_results_07[:,0],tsne_results_07[:,1], c=y_valid_split[:3250], s=10, cmap=cmap)
scatter = plt.scatter(tsne_results_07[:,0],tsne_results_07[:,1], c=y_valid_split[:1200], s=10, cmap=cmap)
plt.legend(handles=scatter.legend_elements()[0], labels=class_names)

image_positions = np.array([[1., 1.]])
for index, position in enumerate(tsne_results_07):
    dist = np.sum((position - image_positions) ** 2, axis=1)
    if np.min(dist) > 0.02: # if far enough from other images
        image_positions = np.r_[image_positions, [position]]
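        # NOTE: the plotted points are embeddings of x_valid_norm[:1200], but the thumbnail is read
        # from x_train[index]; it matches the point only if both arrays hold the same sample at this index.
        imagebox = mpl.offsetbox.AnnotationBbox(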
            mpl.offsetbox.OffsetImage(x_train[index], cmap="binary"),
            position, bboxprops={"lw": 1})
        plt.gca().add_artist(imagebox)
plt.axis("off")
plt.show()
[Figure: t-SNE scatter plot of the dense-layer activations for Experiment 7, colored by class, with image thumbnails]

Experiment 8¶

  • CNN with 3 convolution/max pooling layers
  • Batch Normalization after the dense layer (the L2(0.001) regularizer is commented out in the model code)
Build CNN Model¶
In [133]:
k.clear_session()
model_08 = Sequential([
  Conv2D(filters=128, kernel_size=(3, 3), strides=(1, 1), activation=tf.nn.relu,input_shape=x_train_norm.shape[1:]),
  MaxPool2D((2, 2),strides=2),
#  Dropout(0.3),
  Conv2D(filters=256, kernel_size=(3, 3), strides=(1, 1), activation=tf.nn.relu),
  MaxPool2D((2, 2),strides=2),
#  Dropout(0.3),
  Conv2D(filters=512, kernel_size=(3, 3), strides=(1, 1), activation=tf.nn.relu),
  MaxPool2D((2, 2),strides=2),
#  Dropout(0.3),
  Flatten(),
  Dense(units=384,activation=tf.nn.relu),
#  Dense(units=384,activation=tf.nn.relu,kernel_regularizer=tf.keras.regularizers.L2(0.001)),
  BatchNormalization(),
#  Dropout(0.3),
  Dense(units=10, activation=tf.nn.softmax)
])
results["Experiment8"] = {}
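# NOTE: the label below still mentions L2 regularization, but the L2-regularized Dense layer above
# is commented out; the model as built uses BatchNormalization instead.
results["Experiment8"]["Architecture"] = "• CNN with 3 layers/max pooling layers\n • L2 Regularization(0.001)"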
In [134]:
model_08.summary()
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d (Conv2D)             (None, 30, 30, 128)       3584      
                                                                 
 max_pooling2d (MaxPooling2  (None, 15, 15, 128)       0         
 D)                                                              
                                                                 
 conv2d_1 (Conv2D)           (None, 13, 13, 256)       295168    
                                                                 
 max_pooling2d_1 (MaxPoolin  (None, 6, 6, 256)         0         
 g2D)                                                            
                                                                 
 conv2d_2 (Conv2D)           (None, 4, 4, 512)         1180160   
                                                                 
 max_pooling2d_2 (MaxPoolin  (None, 2, 2, 512)         0         
 g2D)                                                            
                                                                 
 flatten (Flatten)           (None, 2048)              0         
                                                                 
 dense (Dense)               (None, 384)               786816    
                                                                 
 batch_normalization (Batch  (None, 384)               1536      
 Normalization)                                                  
                                                                 
 dense_1 (Dense)             (None, 10)                3850      
                                                                 
=================================================================
Total params: 2271114 (8.66 MB)
Trainable params: 2270346 (8.66 MB)
Non-trainable params: 768 (3.00 KB)
_________________________________________________________________
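Although this model has one more convolution layer than Experiment 7, it has fewer parameters in total (2,271,114 versus 3,843,466). The sketch below traces why, assuming 32×32 inputs, 3×3 convolutions without padding, and 2×2 max pooling with stride 2. The function names `flatten_size` and `dense_params` are illustrative.

```python
def flatten_size(filters_per_stage, side=32, kernel=3, pool=2):
    """Size of the flattened vector after a series of conv + max-pool stages."""
    for _ in filters_per_stage:
        side = (side - kernel + 1) // pool   # convolution, then pooling
    return side * side * filters_per_stage[-1]


def dense_params(inputs, units=384):
    """Weights plus biases of the first dense layer."""
    return inputs * units + units


flat_2_stages = flatten_size([128, 256])        # Experiment 7
flat_3_stages = flatten_size([128, 256, 512])   # Experiment 8

saved_in_dense = dense_params(flat_2_stages) - dense_params(flat_3_stages)
added_by_conv = 3 * 3 * 256 * 512 + 512         # third convolution layer

print(flat_2_stages, flat_3_stages)             # 9216 2048
print(saved_in_dense - added_by_conv)           # 1572352
```

The third pooling stage shrinks the flattened vector from 9,216 to 2,048 values, which removes 2,752,512 parameters from the first dense layer. That is more than the 1,180,160 parameters the extra convolution adds, and the net difference of 1,572,352 matches the gap between the two summaries.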
In [135]:
keras.utils.plot_model(model_08, "CIFAR10_EXP_08.png", show_shapes=True)
You must install pydot (`pip install pydot`) and install graphviz (see instructions at https://graphviz.gitlab.io/download/) for plot_model to work.
In [136]:
model_08.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
              metrics=['accuracy'])

Model Train¶

In [137]:
# Start time
start_time = time.time()

history_08 = model_08.fit(x_train_norm
                    ,y_train_split
                    ,epochs=200
                    ,batch_size=64
                    ,verbose=1
                    ,validation_data=(x_valid_norm, y_valid_split)
                    ,callbacks=[
                     tf.keras.callbacks.ModelCheckpoint("A2_Exp_08_3CNN_BN.h5",save_best_only=True,save_weights_only=False)
                     ,tf.keras.callbacks.EarlyStopping(monitor='val_accuracy', patience=10),
                    ]
                   )

# End time
end_time = time.time()

# Calculate and print the time taken
elapsed_time = end_time - start_time
print(f"Time taken to train Model: {elapsed_time:.2f} seconds")
results["Experiment8"]["Train Time"] = str(round(elapsed_time, 2)) + ' seconds'
Epoch 1/200
704/704 [==============================] - 5s 5ms/step - loss: 1.2948 - accuracy: 0.5441 - val_loss: 1.4855 - val_accuracy: 0.4982
Epoch 2/200
704/704 [==============================] - 3s 5ms/step - loss: 0.9491 - accuracy: 0.6704 - val_loss: 1.4875 - val_accuracy: 0.5378
Epoch 3/200
704/704 [==============================] - 3s 5ms/step - loss: 0.7888 - accuracy: 0.7269 - val_loss: 1.0235 - val_accuracy: 0.6504
Epoch 4/200
704/704 [==============================] - 3s 5ms/step - loss: 0.6651 - accuracy: 0.7691 - val_loss: 1.9600 - val_accuracy: 0.4948
Epoch 5/200
704/704 [==============================] - 3s 5ms/step - loss: 0.5654 - accuracy: 0.8034 - val_loss: 0.9799 - val_accuracy: 0.6866
Epoch 6/200
704/704 [==============================] - 3s 5ms/step - loss: 0.4630 - accuracy: 0.8384 - val_loss: 1.2166 - val_accuracy: 0.6458
Epoch 7/200
704/704 [==============================] - 3s 5ms/step - loss: 0.3849 - accuracy: 0.8667 - val_loss: 1.0119 - val_accuracy: 0.7018
Epoch 8/200
704/704 [==============================] - 3s 5ms/step - loss: 0.3091 - accuracy: 0.8905 - val_loss: 1.0915 - val_accuracy: 0.7100
Epoch 9/200
704/704 [==============================] - 3s 5ms/step - loss: 0.2489 - accuracy: 0.9128 - val_loss: 1.2036 - val_accuracy: 0.7000
Epoch 10/200
704/704 [==============================] - 3s 5ms/step - loss: 0.2037 - accuracy: 0.9289 - val_loss: 1.2918 - val_accuracy: 0.7012
Epoch 11/200
704/704 [==============================] - 3s 5ms/step - loss: 0.1702 - accuracy: 0.9392 - val_loss: 1.4823 - val_accuracy: 0.6802
Epoch 12/200
704/704 [==============================] - 3s 5ms/step - loss: 0.1521 - accuracy: 0.9467 - val_loss: 1.5638 - val_accuracy: 0.6856
Epoch 13/200
704/704 [==============================] - 3s 5ms/step - loss: 0.1391 - accuracy: 0.9512 - val_loss: 1.2930 - val_accuracy: 0.7238
Epoch 14/200
704/704 [==============================] - 3s 5ms/step - loss: 0.1171 - accuracy: 0.9600 - val_loss: 1.4168 - val_accuracy: 0.7076
Epoch 15/200
704/704 [==============================] - 3s 5ms/step - loss: 0.1124 - accuracy: 0.9609 - val_loss: 1.4324 - val_accuracy: 0.7166
Epoch 16/200
704/704 [==============================] - 3s 5ms/step - loss: 0.0994 - accuracy: 0.9651 - val_loss: 1.5606 - val_accuracy: 0.6976
Epoch 17/200
704/704 [==============================] - 3s 5ms/step - loss: 0.0875 - accuracy: 0.9692 - val_loss: 1.7153 - val_accuracy: 0.7072
Epoch 18/200
704/704 [==============================] - 3s 5ms/step - loss: 0.0945 - accuracy: 0.9666 - val_loss: 1.3756 - val_accuracy: 0.7278
Epoch 19/200
704/704 [==============================] - 3s 5ms/step - loss: 0.0806 - accuracy: 0.9713 - val_loss: 1.6530 - val_accuracy: 0.7210
Epoch 20/200
704/704 [==============================] - 3s 5ms/step - loss: 0.0741 - accuracy: 0.9743 - val_loss: 1.4687 - val_accuracy: 0.7280
Epoch 21/200
704/704 [==============================] - 3s 5ms/step - loss: 0.0622 - accuracy: 0.9786 - val_loss: 1.5286 - val_accuracy: 0.7176
Epoch 22/200
704/704 [==============================] - 3s 5ms/step - loss: 0.0730 - accuracy: 0.9742 - val_loss: 1.9872 - val_accuracy: 0.6860
Epoch 23/200
704/704 [==============================] - 3s 5ms/step - loss: 0.0677 - accuracy: 0.9770 - val_loss: 1.6474 - val_accuracy: 0.7192
Epoch 24/200
704/704 [==============================] - 3s 5ms/step - loss: 0.0637 - accuracy: 0.9772 - val_loss: 1.7021 - val_accuracy: 0.7168
Epoch 25/200
704/704 [==============================] - 3s 5ms/step - loss: 0.0557 - accuracy: 0.9808 - val_loss: 1.7471 - val_accuracy: 0.7108
Epoch 26/200
704/704 [==============================] - 3s 5ms/step - loss: 0.0757 - accuracy: 0.9737 - val_loss: 1.8876 - val_accuracy: 0.7058
Epoch 27/200
704/704 [==============================] - 3s 5ms/step - loss: 0.0381 - accuracy: 0.9872 - val_loss: 1.5911 - val_accuracy: 0.7140
Epoch 28/200
704/704 [==============================] - 3s 5ms/step - loss: 0.0554 - accuracy: 0.9807 - val_loss: 1.7155 - val_accuracy: 0.7186
Epoch 29/200
704/704 [==============================] - 3s 5ms/step - loss: 0.0556 - accuracy: 0.9807 - val_loss: 1.8313 - val_accuracy: 0.7168
Epoch 30/200
704/704 [==============================] - 3s 5ms/step - loss: 0.0469 - accuracy: 0.9839 - val_loss: 2.1171 - val_accuracy: 0.7006
Time taken to train Model: 101.77 seconds
In [138]:
train_loss = history_08.history['loss'][-1]  # Training loss from the last epoch
train_accuracy = history_08.history['accuracy'][-1]  # Training accuracy from the last epoch
val_loss = history_08.history['val_loss'][-1]  # Validation loss from the last epoch
val_accuracy = history_08.history['val_accuracy'][-1]  # Validation accuracy from the last epoch

# Print training and validation metrics
print(f"Training Loss: {train_loss:.3f}, Training Accuracy: {train_accuracy:.3f}")
print(f"Validation Loss: {val_loss:.3f}, Validation Accuracy: {val_accuracy:.3f}")

model_08 = tf.keras.models.load_model("A2_Exp_08_3CNN_BN.h5")
test_loss, test_accuracy = model_08.evaluate(x_test_norm, y_test, verbose=0)
print(f"Test Loss: {test_loss:.3f}, Test Accuracy: {test_accuracy:.3f}")

results["Experiment8"]["Test Accuracy"] = round(test_accuracy,3)
results["Experiment8"]["Test Loss"] = round(test_loss,3)
results["Experiment8"]["Train Accuracy"] = round(train_accuracy,3)
results["Experiment8"]["Train Loss"] = round(train_loss,3)
results["Experiment8"]["Validation Accuracy"] = round(val_accuracy,3)
results["Experiment8"]["Validation Loss"] = round(val_loss,3)
Training Loss: 0.047, Training Accuracy: 0.984
Validation Loss: 2.117, Validation Accuracy: 0.701
Test Loss: 0.944, Test Accuracy: 0.699
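The training and validation figures above are taken from the last epoch, whereas the test figures come from the model reloaded from the .h5 file. Assuming that file was written by the same ModelCheckpoint(save_best_only=True) setup used in the other experiments, it holds the epoch with the lowest val_loss (the callback's default monitor), which explains why the test loss is so much lower than the last-epoch validation loss. A minimal sketch of reading the metrics at the best epoch out of a Keras-style history dict follows. The helper name `metrics_at_best_epoch` is hypothetical, and the sample dict holds only the last five epochs of the log above for illustration, not the full run.

```python
def metrics_at_best_epoch(history, monitor="val_loss", mode="min"):
    """Return (position, metrics) for the entry where `monitor` is best.

    `history` is a dict of equal-length lists, like `History.history` in Keras.
    Hypothetical helper, not part of the notebook.
    """
    values = history[monitor]
    pick = min if mode == "min" else max
    best_index = pick(range(len(values)), key=values.__getitem__)
    metrics = {name: series[best_index] for name, series in history.items()}
    return best_index + 1, metrics  # positions are reported 1-based


# Last five epochs (26-30) of Experiment 8, copied from the training log above.
history_tail = {
    "loss": [0.0757, 0.0381, 0.0554, 0.0556, 0.0469],
    "accuracy": [0.9737, 0.9872, 0.9807, 0.9807, 0.9839],
    "val_loss": [1.8876, 1.5911, 1.7155, 1.8313, 2.1171],
    "val_accuracy": [0.7058, 0.7140, 0.7186, 0.7168, 0.7006],
}

position, best = metrics_at_best_epoch(history_tail)
print(position, best["val_loss"], best["val_accuracy"])
```

Applied to the full `history_08.history`, the same lookup would report the training and validation metrics of the checkpointed epoch, which are the ones directly comparable to the test metrics.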
In [139]:
pred08 = model_08.predict(x_test_norm)
print('shape of preds: ', pred08.shape)
313/313 [==============================] - 0s 1ms/step
shape of preds:  (10000, 10)
In [140]:
history_08_dict = history_08.history
history_08_df=pd.DataFrame(history_08_dict)
history_08_df.tail().round(3)
Out[140]:
loss accuracy val_loss val_accuracy
25 0.076 0.974 1.888 0.706
26 0.038 0.987 1.591 0.714
27 0.055 0.981 1.715 0.719
28 0.056 0.981 1.831 0.717
29 0.047 0.984 2.117 0.701

Plotting Performance Metrics¶

We use Matplotlib to create two plots, one for accuracy and one for loss, each showing the training and validation curves for every training epoch.

In [141]:
plt.figure(figsize=(16,12))
display_training_curves(history_08.history['accuracy'], history_08.history['val_accuracy'], 'accuracy', 211)
display_training_curves(history_08.history['loss'], history_08.history['val_loss'], 'loss', 212)
plt.tight_layout()  # called after plotting so the spacing accounts for both subplots
[Figure: training and validation accuracy (top) and loss (bottom) per epoch for Experiment 8]

Confusion matrices¶

We compute the classification report and the confusion matrix with sklearn.metrics, then visualize the confusion matrix and see what it tells us.

In [142]:
pred08_cm=np.argmax(pred08, axis=1)
print_validation_report(y_test, pred08_cm)
Classification Report
              precision    recall  f1-score   support

           0       0.71      0.69      0.70      1000
           1       0.95      0.71      0.81      1000
           2       0.65      0.65      0.65      1000
           3       0.46      0.72      0.56      1000
           4       0.75      0.62      0.68      1000
           5       0.70      0.55      0.61      1000
           6       0.86      0.75      0.80      1000
           7       0.93      0.61      0.74      1000
           8       0.56      0.96      0.71      1000
           9       0.85      0.73      0.79      1000

    accuracy                           0.70     10000
   macro avg       0.74      0.70      0.70     10000
weighted avg       0.74      0.70      0.70     10000

Accuracy Score: 0.6988
Root Mean Square Error: 2.30967530185522
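Because the CIFAR-10 labels are nominal, the root mean square error over class indices depends on the arbitrary numbering of the classes, so the per-class precision and recall are likely the more informative figures here. As a sketch of how those are derived, the block below builds a confusion matrix with NumPy on a small made-up label set. The arrays and the helper `confusion_counts` are illustrative and are not the notebook's `print_validation_report`.

```python
import numpy as np


def confusion_counts(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(matrix, (y_true, y_pred), 1)  # accumulates repeated (row, col) pairs
    return matrix


# Made-up labels for three classes, for illustration only.
y_true = np.array([0, 0, 1, 1, 2, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0, 2])

cm_counts = confusion_counts(y_true, y_pred, n_classes=3)
recall = np.diag(cm_counts) / cm_counts.sum(axis=1)     # correct / all true members
precision = np.diag(cm_counts) / cm_counts.sum(axis=0)  # correct / all predicted as class
print(cm_counts)
print(recall.round(2), precision.round(2))
```

Read this way, the report above says that class 8 (ship) is found almost every time (recall 0.96) but is also over-predicted (precision 0.56).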
In [143]:
plot_confusion_matrix(y_test,pred08_cm)
[Figure: confusion matrix for Experiment 8 on the test set]
In [144]:
cm = sns.light_palette((260, 75, 60), input="husl", as_cmap=True)
df = pd.DataFrame(pred08[0:20], columns = ['airplane'
                                          ,'automobile'
                                          ,'bird'
                                          ,'cat'
                                          ,'deer'
                                          ,'dog'
                                          ,'frog'
                                          ,'horse'
                                          ,'ship'
                                          ,'truck'])
df.style.format("{:.2%}").background_gradient(cmap=cm)
Out[144]:
  airplane automobile bird cat deer dog frog horse ship truck
0 0.20% 0.00% 0.03% 63.93% 0.00% 1.36% 0.56% 0.00% 33.86% 0.05%
1 0.01% 0.01% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 99.98% 0.00%
2 0.76% 0.18% 0.00% 0.04% 0.00% 0.00% 0.00% 0.00% 98.80% 0.21%
3 76.48% 0.02% 0.48% 0.00% 0.00% 0.00% 0.00% 0.00% 23.01% 0.00%
4 0.00% 0.00% 2.15% 7.70% 23.22% 0.31% 66.58% 0.00% 0.03% 0.00%
5 0.00% 0.00% 0.03% 2.06% 0.00% 0.22% 97.54% 0.00% 0.14% 0.00%
6 0.42% 57.52% 0.42% 16.05% 0.01% 1.73% 0.11% 0.15% 6.66% 16.93%
7 6.59% 0.00% 33.58% 6.53% 21.15% 4.08% 25.78% 0.15% 2.05% 0.09%
8 0.03% 0.00% 0.85% 94.48% 0.91% 3.61% 0.03% 0.07% 0.01% 0.01%
9 0.57% 68.29% 0.15% 0.00% 0.00% 0.00% 0.02% 0.00% 16.99% 13.98%
10 59.07% 0.03% 3.53% 12.66% 13.34% 0.32% 0.00% 0.22% 10.75% 0.07%
11 0.00% 0.03% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.02% 99.94%
12 0.31% 0.02% 0.90% 22.42% 21.36% 52.52% 1.01% 0.62% 0.82% 0.02%
13 0.25% 0.00% 0.06% 1.23% 41.70% 21.65% 0.00% 35.02% 0.08% 0.01%
14 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.01% 99.99%
15 0.61% 0.00% 0.01% 0.09% 0.01% 0.00% 0.39% 0.00% 98.88% 0.00%
16 0.01% 0.02% 0.02% 3.06% 0.02% 95.45% 0.01% 1.36% 0.01% 0.05%
17 5.14% 0.57% 2.95% 64.18% 0.69% 3.42% 0.10% 19.65% 0.23% 3.08%
18 0.01% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 99.99% 0.00%
19 0.02% 0.01% 0.09% 1.68% 0.18% 0.62% 97.39% 0.00% 0.00% 0.00%
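Several rows in the table split their probability mass between two classes (row 0 between cat and ship, row 7 between bird and frog). One simple way to flag such cases is the margin between the two largest probabilities. The sketch below uses three rows copied from the table; the 0.10 threshold is an arbitrary choice for illustration.

```python
import numpy as np

# Rows 0, 1 and 7 of the table above, converted from percent to probabilities.
probs = np.array([
    [0.20, 0.00, 0.03, 63.93, 0.00, 1.36, 0.56, 0.00, 33.86, 0.05],
    [0.01, 0.01, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 99.98, 0.00],
    [6.59, 0.00, 33.58, 6.53, 21.15, 4.08, 25.78, 0.15, 2.05, 0.09],
]) / 100.0

top_two = np.sort(probs, axis=1)[:, -2:]  # two largest probabilities per row, ascending
margin = top_two[:, 1] - top_two[:, 0]    # top-1 minus top-2
ambiguous = margin < 0.10                 # threshold chosen for illustration
print(margin.round(4), ambiguous)
```

The same computation on the full `pred08` array would give one margin per test image.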
In [145]:
layer_names = [layer.name for layer in model_08.layers]

layer_names
Out[145]:
['conv2d',
 'max_pooling2d',
 'conv2d_1',
 'max_pooling2d_1',
 'conv2d_2',
 'max_pooling2d_2',
 'flatten',
 'dense',
 'batch_normalization',
 'dense_1']
In [146]:
# Extract the outputs of all 10 layers:
layer_outputs_08 = [layer.output for layer in model_08.layers[:10]]
# Create a model that returns these outputs, given the model input:
activation_model_08 = tf.keras.models.Model(inputs=model_08.input, outputs=layer_outputs_08)

# Get the activations of every layer for the first 1200 validation images
# activations_08 = activation_model_08.predict(x_valid_norm[:3250])
activations_08 = activation_model_08.predict(x_valid_norm[:1200])
dense_layer_activations_08 = activations_08[-3]   # hidden 'dense' layer
output_layer_activations_08 = activations_08[-1]  # softmax output layer 'dense_1'
38/38 [==============================] - 0s 1ms/step
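The negative indices used above depend on the layer order, which differs between experiments. A small sketch, using the layer names printed above, shows which layer `activations_08[-3]` refers to and how the position could be looked up by name instead. The variable names are illustrative.

```python
# Layer names of model_08, copied from the output above.
layer_names_08 = ['conv2d', 'max_pooling2d', 'conv2d_1', 'max_pooling2d_1',
                  'conv2d_2', 'max_pooling2d_2', 'flatten', 'dense',
                  'batch_normalization', 'dense_1']

# Negative indexing counts from the end: -1 is the softmax output, -3 the hidden dense layer.
print(layer_names_08[-3], layer_names_08[-1])

# Looking the position up by name avoids recounting when layers are added or removed.
dense_index = layer_names_08.index('dense')
print(dense_index)
```

With the Keras model itself, `model_08.get_layer('dense').output` selects the same tensor by name.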

sklearn.manifold.TSNE¶

https://scikit-learn.org/stable/modules/generated/sklearn.manifold.TSNE.html

In [147]:
# Reduce the dimension using t-SNE to visualize in a scatterplot
tsne_08 = TSNE(n_components=2, verbose=1, init='pca', learning_rate='auto', perplexity=40, n_iter=300)
tsne_results_08 = tsne_08.fit_transform(dense_layer_activations_08)

# Scaling
tsne_results_08 = (tsne_results_08 - tsne_results_08.min()) / (tsne_results_08.max() - tsne_results_08.min())
[t-SNE] Computing 121 nearest neighbors...
[t-SNE] Indexed 1200 samples in 0.001s...
[t-SNE] Computed neighbors for 1200 samples in 0.021s...
[t-SNE] Computed conditional probabilities for sample 1000 / 1200
[t-SNE] Computed conditional probabilities for sample 1200 / 1200
[t-SNE] Mean sigma: 2.350581
[t-SNE] KL divergence after 250 iterations with early exaggeration: 65.708244
[t-SNE] KL divergence after 300 iterations: 1.568058
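The scaling step above uses one global minimum and maximum for both t-SNE coordinates. That preserves the aspect ratio of the embedding, but it can leave one axis short of the [0, 1] range. If each axis should fill the unit interval, a per-axis variant is a possible alternative. The sketch below uses a small made-up array in place of `tsne_results_08`.

```python
import numpy as np

# Made-up 2-D embedding standing in for the t-SNE output.
points = np.array([[-10.0, 2.0],
                   [0.0, 4.0],
                   [30.0, 3.0]])

# Global scaling, as in the cell above: one min and max shared by both columns.
global_scaled = (points - points.min()) / (points.max() - points.min())

# Per-axis scaling: min and max are taken column by column.
col_min = points.min(axis=0)
col_max = points.max(axis=0)
axis_scaled = (points - col_min) / (col_max - col_min)

print(global_scaled[:, 1].max())  # second column stays well below 1
print(axis_scaled.min(axis=0), axis_scaled.max(axis=0))
```

Which variant is preferable depends on the plot: the global form keeps distances comparable across axes, while the per-axis form uses the full plotting area.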
In [148]:
cmap = plt.cm.tab10
plt.figure(figsize=(16,10))
# scatter = plt.scatter(tsne_results_08[:,0],tsne_results_08[:,1], c=y_valid_split[:3250], s=10, cmap=cmap)
scatter = plt.scatter(tsne_results_08[:,0],tsne_results_08[:,1], c=y_valid_split[:1200], s=10, cmap=cmap)
plt.legend(handles=scatter.legend_elements()[0], labels=class_names)

image_positions = np.array([[1., 1.]])
for index, position in enumerate(tsne_results_08):
    dist = np.sum((position - image_positions) ** 2, axis=1)
    if np.min(dist) > 0.02: # if far enough from other images
        image_positions = np.r_[image_positions, [position]]
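        # Thumbnails come from the array that produced the embedding (x_valid_norm[:1200]),
        # not from x_train; this assumes x_valid_norm is scaled to [0, 1].
        imagebox = mpl.offsetbox.AnnotationBbox(
            mpl.offsetbox.OffsetImage(x_valid_norm[index], cmap="binary"),
            position, bboxprops={"lw": 1})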
        plt.gca().add_artist(imagebox)
plt.axis("off")
plt.show()
[Figure: t-SNE scatterplot of the dense-layer activations for Experiment 8, colored by class, with image thumbnails]

Experiment 9¶

  • CNN with 3 convolutional layers, each followed by max pooling
  • Dropout(0.3)
  • L2 Regularization(0.001)
  • Batch Normalization
Build CNN Model¶
In [149]:
k.clear_session()
model_09 = Sequential([
  Conv2D(filters=128, kernel_size=(3, 3), strides=(1, 1), activation=tf.nn.relu,input_shape=x_train_norm.shape[1:]),
  MaxPool2D((2, 2),strides=2),
  Dropout(0.3),
  Conv2D(filters=256, kernel_size=(3, 3), strides=(1, 1), activation=tf.nn.relu),
  MaxPool2D((2, 2),strides=2),
  Dropout(0.3),
  Conv2D(filters=512, kernel_size=(3, 3), strides=(1, 1), activation=tf.nn.relu),
  MaxPool2D((2, 2),strides=2),
  Dropout(0.3),
  Flatten(),
#  Dense(units=384,activation=tf.nn.relu),
  Dense(units=384,activation=tf.nn.relu,kernel_regularizer=tf.keras.regularizers.L2(0.001)),
  BatchNormalization(),
  Dropout(0.3),
  Dense(units=10, activation=tf.nn.softmax)
])
results["Experiment9"] = {}
results["Experiment9"]["Architecture"] = "• CNN with 3 layers/max pooling layers\n • Dropout(0.3)\n • L2 Regularization(0.001)\n • Batch Normalization"
In [150]:
model_09.summary()
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d (Conv2D)             (None, 30, 30, 128)       3584      
                                                                 
 max_pooling2d (MaxPooling2  (None, 15, 15, 128)       0         
 D)                                                              
                                                                 
 dropout (Dropout)           (None, 15, 15, 128)       0         
                                                                 
 conv2d_1 (Conv2D)           (None, 13, 13, 256)       295168    
                                                                 
 max_pooling2d_1 (MaxPoolin  (None, 6, 6, 256)         0         
 g2D)                                                            
                                                                 
 dropout_1 (Dropout)         (None, 6, 6, 256)         0         
                                                                 
 conv2d_2 (Conv2D)           (None, 4, 4, 512)         1180160   
                                                                 
 max_pooling2d_2 (MaxPoolin  (None, 2, 2, 512)         0         
 g2D)                                                            
                                                                 
 dropout_2 (Dropout)         (None, 2, 2, 512)         0         
                                                                 
 flatten (Flatten)           (None, 2048)              0         
                                                                 
 dense (Dense)               (None, 384)               786816    
                                                                 
 batch_normalization (Batch  (None, 384)               1536      
 Normalization)                                                  
                                                                 
 dropout_3 (Dropout)         (None, 384)               0         
                                                                 
 dense_1 (Dense)             (None, 10)                3850      
                                                                 
=================================================================
Total params: 2271114 (8.66 MB)
Trainable params: 2270346 (8.66 MB)
Non-trainable params: 768 (3.00 KB)
_________________________________________________________________
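The parameter counts in the summary can be reproduced by hand, which is a useful check on the architecture. The sketch below assumes the standard formulas: a Conv2D layer has kernel height x kernel width x input channels x filters weights plus one bias per filter, a Dense layer has inputs x units weights plus one bias per unit, and BatchNormalization keeps four vectors per feature, two of them non-trainable. The helper names are illustrative.

```python
def conv2d_params(kernel, in_channels, filters):
    """Weights plus biases of a Conv2D layer with a square kernel."""
    return kernel * kernel * in_channels * filters + filters


def dense_params(inputs, units):
    """Weights plus biases of a Dense layer."""
    return inputs * units + units


def batchnorm_params(features):
    """gamma and beta (trainable) plus moving mean and variance (non-trainable)."""
    return 4 * features


layers = {
    "conv2d": conv2d_params(3, 3, 128),       # RGB input, 128 filters
    "conv2d_1": conv2d_params(3, 128, 256),
    "conv2d_2": conv2d_params(3, 256, 512),
    "dense": dense_params(2 * 2 * 512, 384),  # flattened 2x2x512 feature map
    "batch_normalization": batchnorm_params(384),
    "dense_1": dense_params(384, 10),
}

total = sum(layers.values())
non_trainable = 2 * 384  # moving mean and variance of the BatchNormalization layer
print(layers)
print(total, total - non_trainable, non_trainable)
```

The pooling and dropout layers contribute no parameters, so the six entries account for the whole total reported in the summary.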
In [151]:
keras.utils.plot_model(model_09, "CIFAR10_EXP_09.png", show_shapes=True)
You must install pydot (`pip install pydot`) and install graphviz (see instructions at https://graphviz.gitlab.io/download/) for plot_model to work.
In [152]:
model_09.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
              metrics=['accuracy'])

Model Train¶

In [153]:
# Start time
start_time = time.time()

history_09 = model_09.fit(x_train_norm
                    ,y_train_split
                    ,epochs=200
                    ,batch_size=64
                    ,verbose=1
                    ,validation_data=(x_valid_norm, y_valid_split)
                    ,callbacks=[
                     tf.keras.callbacks.ModelCheckpoint("A2_Exp_09_3CNN_DO_L2_BN.h5",save_best_only=True,save_weights_only=False)
                     ,tf.keras.callbacks.EarlyStopping(monitor='val_accuracy', patience=10),
                    ]
                   )
# End time
end_time = time.time()

# Calculate and print the time taken
elapsed_time = end_time - start_time
print(f"Time taken to train Model: {elapsed_time:.2f} seconds")
results["Experiment9"]["Train Time"] = str(round(elapsed_time, 2)) + ' seconds'
Epoch 1/200
2024-10-20 03:52:59.267326: E tensorflow/core/grappler/optimizers/meta_optimizer.cc:961] layout failed: INVALID_ARGUMENT: Size of values 0 does not match size of permutation 4 @ fanin shape insequential/dropout/dropout/SelectV2-2-TransposeNHWCToNCHW-LayoutOptimizer
704/704 [==============================] - 6s 6ms/step - loss: 1.9834 - accuracy: 0.4108 - val_loss: 1.6030 - val_accuracy: 0.4954
Epoch 2/200
704/704 [==============================] - 4s 5ms/step - loss: 1.4095 - accuracy: 0.5593 - val_loss: 1.4080 - val_accuracy: 0.5346
Epoch 3/200
704/704 [==============================] - 4s 5ms/step - loss: 1.2251 - accuracy: 0.6155 - val_loss: 1.2170 - val_accuracy: 0.6228
Epoch 4/200
704/704 [==============================] - 4s 5ms/step - loss: 1.1422 - accuracy: 0.6431 - val_loss: 1.1869 - val_accuracy: 0.6390
Epoch 5/200
704/704 [==============================] - 4s 5ms/step - loss: 1.0826 - accuracy: 0.6668 - val_loss: 1.0311 - val_accuracy: 0.6880
Epoch 6/200
704/704 [==============================] - 4s 5ms/step - loss: 1.0340 - accuracy: 0.6892 - val_loss: 1.0256 - val_accuracy: 0.6892
Epoch 7/200
704/704 [==============================] - 4s 5ms/step - loss: 1.0030 - accuracy: 0.7022 - val_loss: 0.9803 - val_accuracy: 0.7124
Epoch 8/200
704/704 [==============================] - 4s 5ms/step - loss: 0.9761 - accuracy: 0.7135 - val_loss: 1.0326 - val_accuracy: 0.6934
Epoch 9/200
704/704 [==============================] - 4s 5ms/step - loss: 0.9484 - accuracy: 0.7235 - val_loss: 0.9733 - val_accuracy: 0.7148
Epoch 10/200
704/704 [==============================] - 4s 5ms/step - loss: 0.9235 - accuracy: 0.7329 - val_loss: 0.9717 - val_accuracy: 0.7072
Epoch 11/200
704/704 [==============================] - 4s 5ms/step - loss: 0.9040 - accuracy: 0.7422 - val_loss: 0.9196 - val_accuracy: 0.7384
Epoch 12/200
704/704 [==============================] - 4s 5ms/step - loss: 0.8831 - accuracy: 0.7473 - val_loss: 0.9342 - val_accuracy: 0.7350
Epoch 13/200
704/704 [==============================] - 4s 5ms/step - loss: 0.8631 - accuracy: 0.7555 - val_loss: 0.8716 - val_accuracy: 0.7480
Epoch 14/200
704/704 [==============================] - 4s 5ms/step - loss: 0.8532 - accuracy: 0.7555 - val_loss: 0.8442 - val_accuracy: 0.7600
Epoch 15/200
704/704 [==============================] - 4s 5ms/step - loss: 0.8392 - accuracy: 0.7645 - val_loss: 0.8650 - val_accuracy: 0.7528
Epoch 16/200
704/704 [==============================] - 4s 5ms/step - loss: 0.8381 - accuracy: 0.7646 - val_loss: 0.8535 - val_accuracy: 0.7598
Epoch 17/200
704/704 [==============================] - 4s 5ms/step - loss: 0.8055 - accuracy: 0.7756 - val_loss: 0.8213 - val_accuracy: 0.7710
Epoch 18/200
704/704 [==============================] - 4s 5ms/step - loss: 0.8042 - accuracy: 0.7758 - val_loss: 0.8003 - val_accuracy: 0.7724
Epoch 19/200
704/704 [==============================] - 4s 5ms/step - loss: 0.7806 - accuracy: 0.7852 - val_loss: 0.8582 - val_accuracy: 0.7572
Epoch 20/200
704/704 [==============================] - 4s 5ms/step - loss: 0.7765 - accuracy: 0.7883 - val_loss: 0.8054 - val_accuracy: 0.7760
Epoch 21/200
704/704 [==============================] - 4s 5ms/step - loss: 0.7622 - accuracy: 0.7917 - val_loss: 0.8034 - val_accuracy: 0.7746
Epoch 22/200
704/704 [==============================] - 4s 5ms/step - loss: 0.7572 - accuracy: 0.7912 - val_loss: 0.7984 - val_accuracy: 0.7810
Epoch 23/200
704/704 [==============================] - 4s 5ms/step - loss: 0.7379 - accuracy: 0.7970 - val_loss: 0.7904 - val_accuracy: 0.7820
Epoch 24/200
704/704 [==============================] - 4s 5ms/step - loss: 0.7296 - accuracy: 0.8012 - val_loss: 0.8079 - val_accuracy: 0.7760
Epoch 25/200
704/704 [==============================] - 4s 5ms/step - loss: 0.7285 - accuracy: 0.8029 - val_loss: 0.7988 - val_accuracy: 0.7784
Epoch 26/200
704/704 [==============================] - 4s 5ms/step - loss: 0.7290 - accuracy: 0.8034 - val_loss: 0.7826 - val_accuracy: 0.7812
Epoch 27/200
704/704 [==============================] - 4s 5ms/step - loss: 0.7103 - accuracy: 0.8110 - val_loss: 0.8074 - val_accuracy: 0.7744
Epoch 28/200
704/704 [==============================] - 4s 5ms/step - loss: 0.6960 - accuracy: 0.8140 - val_loss: 0.7937 - val_accuracy: 0.7830
Epoch 29/200
704/704 [==============================] - 4s 5ms/step - loss: 0.6911 - accuracy: 0.8151 - val_loss: 0.7556 - val_accuracy: 0.7924
Epoch 30/200
704/704 [==============================] - 4s 5ms/step - loss: 0.6853 - accuracy: 0.8169 - val_loss: 0.7653 - val_accuracy: 0.7864
Epoch 31/200
704/704 [==============================] - 4s 5ms/step - loss: 0.6713 - accuracy: 0.8223 - val_loss: 0.7872 - val_accuracy: 0.7944
Epoch 32/200
704/704 [==============================] - 4s 5ms/step - loss: 0.6748 - accuracy: 0.8196 - val_loss: 0.7797 - val_accuracy: 0.7904
Epoch 33/200
704/704 [==============================] - 4s 5ms/step - loss: 0.6632 - accuracy: 0.8238 - val_loss: 0.7383 - val_accuracy: 0.8020
Epoch 34/200
704/704 [==============================] - 4s 5ms/step - loss: 0.6618 - accuracy: 0.8259 - val_loss: 0.7901 - val_accuracy: 0.7892
Epoch 35/200
704/704 [==============================] - 4s 5ms/step - loss: 0.6480 - accuracy: 0.8295 - val_loss: 0.7467 - val_accuracy: 0.7980
Epoch 36/200
704/704 [==============================] - 4s 5ms/step - loss: 0.6414 - accuracy: 0.8299 - val_loss: 0.7515 - val_accuracy: 0.7978
Epoch 37/200
704/704 [==============================] - 4s 5ms/step - loss: 0.6349 - accuracy: 0.8320 - val_loss: 0.7848 - val_accuracy: 0.7846
Epoch 38/200
704/704 [==============================] - 4s 5ms/step - loss: 0.6236 - accuracy: 0.8372 - val_loss: 0.7697 - val_accuracy: 0.7898
Epoch 39/200
704/704 [==============================] - 4s 5ms/step - loss: 0.6296 - accuracy: 0.8344 - val_loss: 0.7341 - val_accuracy: 0.8030
Epoch 40/200
704/704 [==============================] - 4s 5ms/step - loss: 0.6211 - accuracy: 0.8378 - val_loss: 0.7625 - val_accuracy: 0.7950
Epoch 41/200
704/704 [==============================] - 4s 5ms/step - loss: 0.6078 - accuracy: 0.8430 - val_loss: 0.7621 - val_accuracy: 0.7940
Epoch 42/200
704/704 [==============================] - 4s 5ms/step - loss: 0.6068 - accuracy: 0.8412 - val_loss: 0.7502 - val_accuracy: 0.7936
Epoch 43/200
704/704 [==============================] - 4s 5ms/step - loss: 0.5967 - accuracy: 0.8460 - val_loss: 0.7803 - val_accuracy: 0.7874
Epoch 44/200
704/704 [==============================] - 4s 5ms/step - loss: 0.5945 - accuracy: 0.8465 - val_loss: 0.7729 - val_accuracy: 0.7938
Epoch 45/200
704/704 [==============================] - 4s 5ms/step - loss: 0.5928 - accuracy: 0.8474 - val_loss: 0.7382 - val_accuracy: 0.8022
Epoch 46/200
704/704 [==============================] - 4s 5ms/step - loss: 0.5886 - accuracy: 0.8470 - val_loss: 0.7753 - val_accuracy: 0.7936
Epoch 47/200
704/704 [==============================] - 4s 5ms/step - loss: 0.5810 - accuracy: 0.8519 - val_loss: 0.7693 - val_accuracy: 0.7948
Epoch 48/200
704/704 [==============================] - 4s 5ms/step - loss: 0.5785 - accuracy: 0.8509 - val_loss: 0.7534 - val_accuracy: 0.7968
Epoch 49/200
704/704 [==============================] - 4s 5ms/step - loss: 0.5736 - accuracy: 0.8533 - val_loss: 0.7459 - val_accuracy: 0.7954
Time taken to train Model: 185.30 seconds
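The run stops at epoch 49 because EarlyStopping(monitor='val_accuracy', patience=10) ends training once val_accuracy has gone 10 consecutive epochs without exceeding its best value. The sketch below replays that rule on the val_accuracy values from the log above. It is a simplified re-implementation for illustration (strict improvement, no minimum delta or baseline), not the Keras source, and `stopping_epoch` is a hypothetical helper.

```python
def stopping_epoch(values, patience):
    """Return (stop_epoch, best_epoch), both 1-based, for a metric to maximize.

    stop_epoch is None if the patience is never exhausted.
    """
    best_value = float("-inf")
    best_epoch = None
    wait = 0
    for epoch, value in enumerate(values, start=1):
        if value > best_value:  # strict improvement resets the counter
            best_value, best_epoch, wait = value, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return None, best_epoch


# val_accuracy for epochs 1-49 of Experiment 9, copied from the log above.
val_accuracy_09 = [
    0.4954, 0.5346, 0.6228, 0.6390, 0.6880, 0.6892, 0.7124, 0.6934, 0.7148, 0.7072,
    0.7384, 0.7350, 0.7480, 0.7600, 0.7528, 0.7598, 0.7710, 0.7724, 0.7572, 0.7760,
    0.7746, 0.7810, 0.7820, 0.7760, 0.7784, 0.7812, 0.7744, 0.7830, 0.7924, 0.7864,
    0.7944, 0.7904, 0.8020, 0.7892, 0.7980, 0.7978, 0.7846, 0.7898, 0.8030, 0.7950,
    0.7940, 0.7936, 0.7874, 0.7938, 0.8022, 0.7936, 0.7948, 0.7968, 0.7954,
]

stop, best = stopping_epoch(val_accuracy_09, patience=10)
print(stop, best)
```

Note that ModelCheckpoint(save_best_only=True) monitors val_loss by default, so the saved model is the epoch with the lowest val_loss, which need not be the epoch with the highest val_accuracy. In this run both fall on epoch 39 (val_loss 0.7341, val_accuracy 0.8030).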
In [154]:
train_loss = history_09.history['loss'][-1]  # Training loss from the last epoch
train_accuracy = history_09.history['accuracy'][-1]  # Training accuracy from the last epoch
val_loss = history_09.history['val_loss'][-1]  # Validation loss from the last epoch
val_accuracy = history_09.history['val_accuracy'][-1]  # Validation accuracy from the last epoch

# Print training and validation metrics
print(f"Training Loss: {train_loss:.3f}, Training Accuracy: {train_accuracy:.3f}")
print(f"Validation Loss: {val_loss:.3f}, Validation Accuracy: {val_accuracy:.3f}")

model_09 = tf.keras.models.load_model("A2_Exp_09_3CNN_DO_L2_BN.h5")
test_loss, test_accuracy = model_09.evaluate(x_test_norm, y_test, verbose=0)
print(f"Test Loss: {test_loss:.3f}, Test Accuracy: {test_accuracy:.3f}")

results["Experiment9"]["Test Accuracy"] = round(test_accuracy,3)
results["Experiment9"]["Test Loss"] = round(test_loss,3)
results["Experiment9"]["Train Accuracy"] = round(train_accuracy,3)
results["Experiment9"]["Train Loss"] = round(train_loss,3)
results["Experiment9"]["Validation Accuracy"] = round(val_accuracy,3)
results["Experiment9"]["Validation Loss"] = round(val_loss,3)
Training Loss: 0.574, Training Accuracy: 0.853
Validation Loss: 0.746, Validation Accuracy: 0.795
Test Loss: 0.743, Test Accuracy: 0.798
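One way to quantify the effect of the added regularization is the gap between training and validation accuracy at the last epoch. The sketch below uses the figures printed for Experiments 8 and 9 in this notebook; the dictionary layout is illustrative.

```python
# Last-epoch accuracies as printed for each experiment.
last_epoch = {
    "Experiment8": {"train": 0.984, "validation": 0.701},
    "Experiment9": {"train": 0.853, "validation": 0.795},
}

# Generalization gap: training accuracy minus validation accuracy.
gaps = {name: round(acc["train"] - acc["validation"], 3)
        for name, acc in last_epoch.items()}
print(gaps)
```

The gap shrinks from roughly 0.28 to roughly 0.06, which is consistent with dropout and L2 regularization reducing overfitting, although a single run per configuration cannot rule out other causes.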
In [155]:
pred09 = model_09.predict(x_test_norm)
print('shape of preds: ', pred09.shape)
313/313 [==============================] - 0s 1ms/step
shape of preds:  (10000, 10)
In [156]:
history_09_dict = history_09.history
history_09_df=pd.DataFrame(history_09_dict)
history_09_df.tail().round(3)
Out[156]:
loss accuracy val_loss val_accuracy
44 0.593 0.847 0.738 0.802
45 0.589 0.847 0.775 0.794
46 0.581 0.852 0.769 0.795
47 0.578 0.851 0.753 0.797
48 0.574 0.853 0.746 0.795

Plotting Performance Metrics¶

We use Matplotlib to create two plots, one for accuracy and one for loss, each showing the training and validation curves for every training epoch.

In [157]:
plt.figure(figsize=(16,12))
display_training_curves(history_09.history['accuracy'], history_09.history['val_accuracy'], 'accuracy', 211)
display_training_curves(history_09.history['loss'], history_09.history['val_loss'], 'loss', 212)
plt.tight_layout()  # called after plotting so the spacing accounts for both subplots
[Figure: training and validation accuracy (top) and loss (bottom) per epoch for Experiment 9]

Confusion matrices¶

We compute the classification report and the confusion matrix with sklearn.metrics, then visualize the confusion matrix and see what it tells us.

In [158]:
pred09_cm=np.argmax(pred09, axis=1)
print_validation_report(y_test, pred09_cm)
Classification Report
              precision    recall  f1-score   support

           0       0.83      0.80      0.81      1000
           1       0.92      0.89      0.90      1000
           2       0.80      0.62      0.70      1000
           3       0.66      0.65      0.65      1000
           4       0.72      0.82      0.77      1000
           5       0.68      0.75      0.71      1000
           6       0.81      0.89      0.85      1000
           7       0.85      0.81      0.83      1000
           8       0.87      0.88      0.88      1000
           9       0.85      0.86      0.86      1000

    accuracy                           0.80     10000
   macro avg       0.80      0.80      0.80     10000
weighted avg       0.80      0.80      0.80     10000

Accuracy Score: 0.7976
Root Mean Square Error: 1.8043835512440252
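To see which classes benefit most from the added regularization, the per-class f1-scores of Experiments 8 and 9 can be compared directly. The sketch below copies the f1-score columns from the two classification reports and assumes the label order 0-9 matches the class-name order used for the prediction tables; the variable names are illustrative.

```python
cifar10_classes = ['airplane', 'automobile', 'bird', 'cat', 'deer',
                   'dog', 'frog', 'horse', 'ship', 'truck']

# f1-score columns copied from the two classification reports.
f1_exp08 = [0.70, 0.81, 0.65, 0.56, 0.68, 0.61, 0.80, 0.74, 0.71, 0.79]
f1_exp09 = [0.81, 0.90, 0.70, 0.65, 0.77, 0.71, 0.85, 0.83, 0.88, 0.86]

# Per-class change in f1-score from Experiment 8 to Experiment 9.
gains = {name: round(new - old, 2)
         for name, old, new in zip(cifar10_classes, f1_exp08, f1_exp09)}
largest = max(gains, key=gains.get)
print(gains)
print(largest, gains[largest])
```

Every class improves, and cat (class 3) remains the weakest class in both experiments.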
In [159]:
plot_confusion_matrix(y_test,pred09_cm)
[Figure: confusion matrix for Experiment 9 on the test set]
In [160]:
cm = sns.light_palette((260, 75, 60), input="husl", as_cmap=True)
df = pd.DataFrame(pred09[0:20], columns = ['airplane'
                                          ,'automobile'
                                          ,'bird'
                                          ,'cat'
                                          ,'deer'
                                          ,'dog'
                                          ,'frog'
                                          ,'horse'
                                          ,'ship'
                                          ,'truck'])
df.style.format("{:.2%}").background_gradient(cmap=cm)
Out[160]:
  airplane automobile bird cat deer dog frog horse ship truck
0 0.08% 0.01% 0.44% 89.39% 0.68% 8.50% 0.60% 0.28% 0.01% 0.00%
1 0.11% 0.52% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 99.36% 0.01%
2 9.73% 36.61% 0.33% 1.01% 0.45% 0.37% 0.32% 0.42% 42.69% 8.08%
3 88.47% 2.43% 4.12% 0.09% 1.06% 0.02% 0.30% 0.03% 3.43% 0.05%
4 0.00% 0.00% 0.08% 0.06% 0.64% 0.00% 99.22% 0.00% 0.00% 0.00%
5 0.00% 0.00% 0.03% 0.53% 0.06% 0.27% 99.11% 0.00% 0.01% 0.00%
6 0.01% 94.05% 0.01% 0.08% 0.00% 0.24% 0.15% 0.01% 0.00% 5.45%
7 0.03% 0.00% 1.54% 0.47% 1.01% 0.15% 96.76% 0.02% 0.01% 0.00%
8 0.01% 0.00% 0.09% 98.11% 0.24% 1.48% 0.04% 0.03% 0.00% 0.00%
9 0.29% 65.95% 0.01% 0.02% 0.01% 0.02% 0.35% 0.00% 0.39% 32.96%
10 17.56% 0.12% 2.42% 26.70% 3.42% 23.74% 0.16% 4.38% 21.17% 0.33%
11 0.00% 0.01% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 99.99%
12 0.07% 0.01% 3.03% 19.53% 12.70% 62.71% 0.30% 1.63% 0.01% 0.01%
13 0.00% 0.00% 0.00% 0.00% 0.05% 0.02% 0.00% 99.93% 0.00% 0.00%
14 0.00% 0.01% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.01% 99.98%
15 3.22% 0.24% 0.30% 0.36% 0.06% 0.02% 12.08% 0.00% 83.71% 0.00%
16 0.02% 0.01% 0.25% 4.02% 0.04% 95.31% 0.12% 0.21% 0.01% 0.01%
17 0.15% 0.01% 2.14% 19.37% 3.09% 15.53% 0.63% 58.78% 0.11% 0.19%
18 0.19% 0.07% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 99.51% 0.22%
19 0.00% 0.00% 0.01% 0.11% 0.03% 0.01% 99.83% 0.00% 0.00% 0.00%
In [161]:
layer_names = [layer.name for layer in model_09.layers]

layer_names
Out[161]:
['conv2d',
 'max_pooling2d',
 'dropout',
 'conv2d_1',
 'max_pooling2d_1',
 'dropout_1',
 'conv2d_2',
 'max_pooling2d_2',
 'dropout_2',
 'flatten',
 'dense',
 'batch_normalization',
 'dropout_3',
 'dense_1']
In [162]:
# Extract the outputs of all 14 layers:
layer_outputs_09 = [layer.output for layer in model_09.layers[:14]]
# Create a model that returns these outputs, given the model input:
activation_model_09 = tf.keras.models.Model(inputs=model_09.input, outputs=layer_outputs_09)

# Get the activations of every layer for the first 1200 validation images
# activations_09 = activation_model_09.predict(x_valid_norm[:3250])
activations_09 = activation_model_09.predict(x_valid_norm[:1200])
dense_layer_activations_09 = activations_09[-4]   # hidden 'dense' layer
output_layer_activations_09 = activations_09[-1]  # softmax output layer 'dense_1'
38/38 [==============================] - 0s 1ms/step

sklearn.manifold.TSNE¶

https://scikit-learn.org/stable/modules/generated/sklearn.manifold.TSNE.html

In [163]:
# Reduce the dimension using t-SNE to visualize in a scatterplot
tsne_09 = TSNE(n_components=2, verbose=1, init='pca', learning_rate='auto', perplexity=40, n_iter=300)
tsne_results_09 = tsne_09.fit_transform(dense_layer_activations_09)

# Scaling
tsne_results_09 = (tsne_results_09 - tsne_results_09.min()) / (tsne_results_09.max() - tsne_results_09.min())
[t-SNE] Computing 121 nearest neighbors...
[t-SNE] Indexed 1200 samples in 0.000s...
[t-SNE] Computed neighbors for 1200 samples in 0.021s...
[t-SNE] Computed conditional probabilities for sample 1000 / 1200
[t-SNE] Computed conditional probabilities for sample 1200 / 1200
[t-SNE] Mean sigma: 2.847355
[t-SNE] KL divergence after 250 iterations with early exaggeration: 62.816189
[t-SNE] KL divergence after 300 iterations: 1.159818
In [164]:
cmap = plt.cm.tab10
plt.figure(figsize=(16,10))
# scatter = plt.scatter(tsne_results_09[:,0],tsne_results_09[:,1], c=y_valid_split[:3250], s=10, cmap=cmap)
scatter = plt.scatter(tsne_results_09[:,0],tsne_results_09[:,1], c=y_valid_split[:1200], s=10, cmap=cmap)
plt.legend(handles=scatter.legend_elements()[0], labels=class_names)

image_positions = np.array([[1., 1.]])
for index, position in enumerate(tsne_results_09):
    dist = np.sum((position - image_positions) ** 2, axis=1)
    if np.min(dist) > 0.02: # if far enough from other images
        image_positions = np.r_[image_positions, [position]]
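        # Thumbnails come from the array that produced the embedding (x_valid_norm[:1200]),
        # not from x_train; this assumes x_valid_norm is scaled to [0, 1].
        imagebox = mpl.offsetbox.AnnotationBbox(
            mpl.offsetbox.OffsetImage(x_valid_norm[index], cmap="binary"),
            position, bboxprops={"lw": 1})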
        plt.gca().add_artist(imagebox)
plt.axis("off")
plt.show()
[Figure: t-SNE scatterplot of the dense-layer activations for Experiment 9, colored by class, with image thumbnails]

Experiment 10¶

  • CNN with 3 convolutional layers, each followed by max pooling
  • 2 Fully-Connected Hidden Layers
  • Dropout(0.3)
  • L2 Regularization(0.001)
  • Batch Normalization
Build CNN Model¶
In [165]:
k.clear_session()
model_10 = Sequential([
  Conv2D(filters=128, kernel_size=(3, 3), strides=(1, 1), activation=tf.nn.relu,input_shape=x_train_norm.shape[1:]),
  MaxPool2D((2, 2),strides=2),
  Dropout(0.3),
  Conv2D(filters=256, kernel_size=(3, 3), strides=(1, 1), activation=tf.nn.relu),
  MaxPool2D((2, 2),strides=2),
  Dropout(0.3),
  Conv2D(filters=512, kernel_size=(3, 3), strides=(1, 1), activation=tf.nn.relu),
  MaxPool2D((2, 2),strides=2),
  Dropout(0.3),
  Flatten(),
  Dense(units=384,activation=tf.nn.relu,kernel_regularizer=tf.keras.regularizers.L2(0.001)),
  BatchNormalization(),
  Dropout(0.3),
  Dense(units=768,activation=tf.nn.relu,kernel_regularizer=tf.keras.regularizers.L2(0.001)),
  BatchNormalization(),
  Dropout(0.3),
  Dense(units=10, activation=tf.nn.softmax)
])
results["Experiment10"] = {}
results["Experiment10"]["Architecture"] = "• CNN with 3 layers/max pooling layers\n • 2 Fully-Connected Hidden Layers\n • Dropout(0.3)\n • L2 Regularization(0.001)\n • Batch Normalization"
In [166]:
model_10.summary()
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d (Conv2D)             (None, 30, 30, 128)       3584      
                                                                 
 max_pooling2d (MaxPooling2  (None, 15, 15, 128)       0         
 D)                                                              
                                                                 
 dropout (Dropout)           (None, 15, 15, 128)       0         
                                                                 
 conv2d_1 (Conv2D)           (None, 13, 13, 256)       295168    
                                                                 
 max_pooling2d_1 (MaxPoolin  (None, 6, 6, 256)         0         
 g2D)                                                            
                                                                 
 dropout_1 (Dropout)         (None, 6, 6, 256)         0         
                                                                 
 conv2d_2 (Conv2D)           (None, 4, 4, 512)         1180160   
                                                                 
 max_pooling2d_2 (MaxPoolin  (None, 2, 2, 512)         0         
 g2D)                                                            
                                                                 
 dropout_2 (Dropout)         (None, 2, 2, 512)         0         
                                                                 
 flatten (Flatten)           (None, 2048)              0         
                                                                 
 dense (Dense)               (None, 384)               786816    
                                                                 
 batch_normalization (Batch  (None, 384)               1536      
 Normalization)                                                  
                                                                 
 dropout_3 (Dropout)         (None, 384)               0         
                                                                 
 dense_1 (Dense)             (None, 768)               295680    
                                                                 
 batch_normalization_1 (Bat  (None, 768)               3072      
 chNormalization)                                                
                                                                 
 dropout_4 (Dropout)         (None, 768)               0         
                                                                 
 dense_2 (Dense)             (None, 10)                7690      
                                                                 
=================================================================
Total params: 2573706 (9.82 MB)
Trainable params: 2571402 (9.81 MB)
Non-trainable params: 2304 (9.00 KB)
_________________________________________________________________
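The output shapes and parameter counts in the summary can be reproduced by hand, which is a quick way to confirm the layers are wired as intended. The sketch below is plain Python and is not part of the original notebook. It assumes 32x32x3 CIFAR-10 inputs, 'valid' padding (the Keras default for Conv2D), and helper names that are made up for illustration.

```python
# Hand-check of the Keras summary for the 3-conv / 2-dense model.
# Assumptions: 32x32x3 inputs, Conv2D with 'valid' padding and stride 1,
# 2x2 max pooling with stride 2. Helper names are illustrative only.

def conv_out(size, kernel=3, stride=1):
    """Spatial size after a 'valid' convolution."""
    return (size - kernel) // stride + 1

def pool_out(size, pool=2, stride=2):
    """Spatial size after 'valid' max pooling."""
    return (size - pool) // stride + 1

def conv_params(in_ch, filters, kernel=3):
    """Kernel weights plus one bias per filter."""
    return (kernel * kernel * in_ch + 1) * filters

def dense_params(in_units, units):
    """Weights plus one bias per unit."""
    return (in_units + 1) * units

size, channels = 32, 3
counts = {}
for name, filters in [("conv2d", 128), ("conv2d_1", 256), ("conv2d_2", 512)]:
    counts[name] = conv_params(channels, filters)
    size = pool_out(conv_out(size))   # 32 -> 30 -> 15 -> 13 -> 6 -> 4 -> 2
    channels = filters

flat = size * size * channels              # 2 * 2 * 512
counts["dense"] = dense_params(flat, 384)
counts["batch_normalization"] = 4 * 384    # gamma, beta, moving mean, moving variance
counts["dense_1"] = dense_params(384, 768)
counts["batch_normalization_1"] = 4 * 768
counts["dense_2"] = dense_params(768, 10)

total = sum(counts.values())
non_trainable = 2 * 384 + 2 * 768          # moving mean and variance are not trained
print(flat, counts, total, total - non_trainable, non_trainable)
```

The totals agree with the summary: 2,573,706 parameters, of which the 2,304 non-trainable ones are the moving statistics of the two batch normalization layers.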
In [167]:
keras.utils.plot_model(model_10, "CIFAR10_EXP_10.png", show_shapes=True)
You must install pydot (`pip install pydot`) and install graphviz (see instructions at https://graphviz.gitlab.io/download/) for plot_model to work.
In [168]:
model_10.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
              metrics=['accuracy'])

Model Train

In [169]:
# Start time
start_time = time.time()

history_10 = model_10.fit(x_train_norm
                    ,y_train_split
                    ,epochs=200
                    ,batch_size=64
                    ,verbose=1
                    ,validation_data=(x_valid_norm, y_valid_split)
                    ,callbacks=[
                     tf.keras.callbacks.ModelCheckpoint("A2_Exp_10_3CNN_2DNN_DO_L2_BN.h5",save_best_only=True,save_weights_only=False)
                     ,tf.keras.callbacks.EarlyStopping(monitor='val_accuracy', patience=10),
                    ]
                   )
# End time
end_time = time.time()

# Calculate and print the time taken
elapsed_time = end_time - start_time
print(f"Time taken to train Model: {elapsed_time:.2f} seconds")
results["Experiment10"]["Train Time"] = str(round(elapsed_time, 2)) + ' seconds'
Epoch 1/200
2024-10-20 03:56:09.179894: E tensorflow/core/grappler/optimizers/meta_optimizer.cc:961] layout failed: INVALID_ARGUMENT: Size of values 0 does not match size of permutation 4 @ fanin shape insequential/dropout/dropout/SelectV2-2-TransposeNHWCToNCHW-LayoutOptimizer
704/704 [==============================] - 6s 6ms/step - loss: 2.7633 - accuracy: 0.3147 - val_loss: 2.1523 - val_accuracy: 0.4300
Epoch 2/200
704/704 [==============================] - 4s 6ms/step - loss: 1.8389 - accuracy: 0.5035 - val_loss: 1.7617 - val_accuracy: 0.4958
Epoch 3/200
704/704 [==============================] - 4s 6ms/step - loss: 1.5069 - accuracy: 0.5783 - val_loss: 1.5316 - val_accuracy: 0.5566
Epoch 4/200
704/704 [==============================] - 4s 6ms/step - loss: 1.3299 - accuracy: 0.6196 - val_loss: 1.2661 - val_accuracy: 0.6322
Epoch 5/200
704/704 [==============================] - 4s 6ms/step - loss: 1.2306 - accuracy: 0.6470 - val_loss: 1.2873 - val_accuracy: 0.6304
Epoch 6/200
704/704 [==============================] - 4s 6ms/step - loss: 1.1748 - accuracy: 0.6653 - val_loss: 1.1088 - val_accuracy: 0.6808
Epoch 7/200
704/704 [==============================] - 4s 6ms/step - loss: 1.1309 - accuracy: 0.6852 - val_loss: 1.1560 - val_accuracy: 0.6680
Epoch 8/200
704/704 [==============================] - 4s 6ms/step - loss: 1.0975 - accuracy: 0.6983 - val_loss: 1.0114 - val_accuracy: 0.7256
Epoch 9/200
704/704 [==============================] - 4s 6ms/step - loss: 1.0579 - accuracy: 0.7113 - val_loss: 1.0700 - val_accuracy: 0.7078
Epoch 10/200
704/704 [==============================] - 4s 6ms/step - loss: 1.0330 - accuracy: 0.7187 - val_loss: 1.0668 - val_accuracy: 0.7136
Epoch 11/200
704/704 [==============================] - 4s 6ms/step - loss: 1.0067 - accuracy: 0.7299 - val_loss: 1.0101 - val_accuracy: 0.7212
Epoch 12/200
704/704 [==============================] - 4s 6ms/step - loss: 0.9810 - accuracy: 0.7371 - val_loss: 0.9827 - val_accuracy: 0.7388
Epoch 13/200
704/704 [==============================] - 4s 6ms/step - loss: 0.9591 - accuracy: 0.7451 - val_loss: 1.0030 - val_accuracy: 0.7258
Epoch 14/200
704/704 [==============================] - 4s 6ms/step - loss: 0.9346 - accuracy: 0.7504 - val_loss: 1.0178 - val_accuracy: 0.7300
Epoch 15/200
704/704 [==============================] - 4s 6ms/step - loss: 0.9194 - accuracy: 0.7566 - val_loss: 0.9340 - val_accuracy: 0.7514
Epoch 16/200
704/704 [==============================] - 4s 6ms/step - loss: 0.9052 - accuracy: 0.7650 - val_loss: 0.9278 - val_accuracy: 0.7560
Epoch 17/200
704/704 [==============================] - 4s 6ms/step - loss: 0.8801 - accuracy: 0.7676 - val_loss: 0.8870 - val_accuracy: 0.7628
Epoch 18/200
704/704 [==============================] - 4s 6ms/step - loss: 0.8623 - accuracy: 0.7753 - val_loss: 0.9299 - val_accuracy: 0.7494
Epoch 19/200
704/704 [==============================] - 4s 6ms/step - loss: 0.8460 - accuracy: 0.7793 - val_loss: 0.9020 - val_accuracy: 0.7704
Epoch 20/200
704/704 [==============================] - 4s 6ms/step - loss: 0.8355 - accuracy: 0.7825 - val_loss: 0.8924 - val_accuracy: 0.7660
Epoch 21/200
704/704 [==============================] - 4s 6ms/step - loss: 0.8195 - accuracy: 0.7860 - val_loss: 0.8450 - val_accuracy: 0.7764
Epoch 22/200
704/704 [==============================] - 4s 6ms/step - loss: 0.8051 - accuracy: 0.7947 - val_loss: 0.8561 - val_accuracy: 0.7728
Epoch 23/200
704/704 [==============================] - 4s 6ms/step - loss: 0.7915 - accuracy: 0.7967 - val_loss: 0.8747 - val_accuracy: 0.7710
Epoch 24/200
704/704 [==============================] - 4s 6ms/step - loss: 0.7793 - accuracy: 0.7978 - val_loss: 0.8515 - val_accuracy: 0.7758
Epoch 25/200
704/704 [==============================] - 4s 6ms/step - loss: 0.7696 - accuracy: 0.8022 - val_loss: 0.8211 - val_accuracy: 0.7830
Epoch 26/200
704/704 [==============================] - 4s 6ms/step - loss: 0.7538 - accuracy: 0.8065 - val_loss: 0.8288 - val_accuracy: 0.7812
Epoch 27/200
704/704 [==============================] - 4s 6ms/step - loss: 0.7461 - accuracy: 0.8083 - val_loss: 0.8537 - val_accuracy: 0.7756
Epoch 28/200
704/704 [==============================] - 4s 6ms/step - loss: 0.7250 - accuracy: 0.8139 - val_loss: 0.7816 - val_accuracy: 0.7916
Epoch 29/200
704/704 [==============================] - 4s 6ms/step - loss: 0.7160 - accuracy: 0.8167 - val_loss: 0.8113 - val_accuracy: 0.7898
Epoch 30/200
704/704 [==============================] - 4s 6ms/step - loss: 0.7081 - accuracy: 0.8178 - val_loss: 0.8132 - val_accuracy: 0.7848
Epoch 31/200
704/704 [==============================] - 4s 6ms/step - loss: 0.6997 - accuracy: 0.8208 - val_loss: 0.8011 - val_accuracy: 0.7894
Epoch 32/200
704/704 [==============================] - 4s 6ms/step - loss: 0.6952 - accuracy: 0.8225 - val_loss: 0.8451 - val_accuracy: 0.7762
Epoch 33/200
704/704 [==============================] - 4s 6ms/step - loss: 0.6830 - accuracy: 0.8251 - val_loss: 0.8150 - val_accuracy: 0.7836
Epoch 34/200
704/704 [==============================] - 4s 6ms/step - loss: 0.6735 - accuracy: 0.8310 - val_loss: 0.8077 - val_accuracy: 0.7852
Epoch 35/200
704/704 [==============================] - 4s 6ms/step - loss: 0.6721 - accuracy: 0.8302 - val_loss: 0.7888 - val_accuracy: 0.7964
Epoch 36/200
704/704 [==============================] - 4s 6ms/step - loss: 0.6589 - accuracy: 0.8320 - val_loss: 0.8001 - val_accuracy: 0.7850
Epoch 37/200
704/704 [==============================] - 4s 6ms/step - loss: 0.6566 - accuracy: 0.8324 - val_loss: 0.7991 - val_accuracy: 0.7852
Epoch 38/200
704/704 [==============================] - 4s 6ms/step - loss: 0.6427 - accuracy: 0.8350 - val_loss: 0.8206 - val_accuracy: 0.7774
Epoch 39/200
704/704 [==============================] - 4s 6ms/step - loss: 0.6308 - accuracy: 0.8388 - val_loss: 0.8357 - val_accuracy: 0.7710
Epoch 40/200
704/704 [==============================] - 4s 6ms/step - loss: 0.6283 - accuracy: 0.8413 - val_loss: 0.7936 - val_accuracy: 0.7906
Epoch 41/200
704/704 [==============================] - 4s 6ms/step - loss: 0.6147 - accuracy: 0.8427 - val_loss: 0.8122 - val_accuracy: 0.7826
Epoch 42/200
704/704 [==============================] - 4s 6ms/step - loss: 0.6112 - accuracy: 0.8471 - val_loss: 0.7681 - val_accuracy: 0.7966
Epoch 43/200
704/704 [==============================] - 4s 6ms/step - loss: 0.6044 - accuracy: 0.8492 - val_loss: 0.8230 - val_accuracy: 0.7796
Epoch 44/200
704/704 [==============================] - 4s 6ms/step - loss: 0.6008 - accuracy: 0.8502 - val_loss: 0.7868 - val_accuracy: 0.7926
Epoch 45/200
704/704 [==============================] - 4s 6ms/step - loss: 0.5936 - accuracy: 0.8493 - val_loss: 0.7622 - val_accuracy: 0.7996
Epoch 46/200
704/704 [==============================] - 4s 6ms/step - loss: 0.5947 - accuracy: 0.8535 - val_loss: 0.7753 - val_accuracy: 0.7900
Epoch 47/200
704/704 [==============================] - 4s 6ms/step - loss: 0.5849 - accuracy: 0.8530 - val_loss: 0.7844 - val_accuracy: 0.7940
Epoch 48/200
704/704 [==============================] - 4s 6ms/step - loss: 0.5829 - accuracy: 0.8546 - val_loss: 0.7776 - val_accuracy: 0.7958
Epoch 49/200
704/704 [==============================] - 4s 6ms/step - loss: 0.5693 - accuracy: 0.8597 - val_loss: 0.7883 - val_accuracy: 0.7938
Epoch 50/200
704/704 [==============================] - 4s 6ms/step - loss: 0.5698 - accuracy: 0.8593 - val_loss: 0.7657 - val_accuracy: 0.7982
Epoch 51/200
704/704 [==============================] - 4s 6ms/step - loss: 0.5607 - accuracy: 0.8606 - val_loss: 0.7836 - val_accuracy: 0.7962
Epoch 52/200
704/704 [==============================] - 4s 6ms/step - loss: 0.5676 - accuracy: 0.8578 - val_loss: 0.7648 - val_accuracy: 0.7992
Epoch 53/200
704/704 [==============================] - 4s 6ms/step - loss: 0.5586 - accuracy: 0.8605 - val_loss: 0.7606 - val_accuracy: 0.8012
Epoch 54/200
704/704 [==============================] - 4s 6ms/step - loss: 0.5529 - accuracy: 0.8636 - val_loss: 0.7709 - val_accuracy: 0.7964
Epoch 55/200
704/704 [==============================] - 4s 6ms/step - loss: 0.5404 - accuracy: 0.8662 - val_loss: 0.8297 - val_accuracy: 0.7866
Epoch 56/200
704/704 [==============================] - 4s 6ms/step - loss: 0.5448 - accuracy: 0.8660 - val_loss: 0.7625 - val_accuracy: 0.7984
Epoch 57/200
704/704 [==============================] - 4s 6ms/step - loss: 0.5388 - accuracy: 0.8664 - val_loss: 0.7929 - val_accuracy: 0.7916
Epoch 58/200
704/704 [==============================] - 4s 6ms/step - loss: 0.5343 - accuracy: 0.8688 - val_loss: 0.7625 - val_accuracy: 0.7998
Epoch 59/200
704/704 [==============================] - 4s 6ms/step - loss: 0.5337 - accuracy: 0.8676 - val_loss: 0.7665 - val_accuracy: 0.8004
Epoch 60/200
704/704 [==============================] - 4s 6ms/step - loss: 0.5294 - accuracy: 0.8714 - val_loss: 0.7494 - val_accuracy: 0.8118
Epoch 61/200
704/704 [==============================] - 4s 6ms/step - loss: 0.5212 - accuracy: 0.8726 - val_loss: 0.7581 - val_accuracy: 0.8046
Epoch 62/200
704/704 [==============================] - 4s 6ms/step - loss: 0.5144 - accuracy: 0.8732 - val_loss: 0.7583 - val_accuracy: 0.8060
Epoch 63/200
704/704 [==============================] - 4s 6ms/step - loss: 0.5099 - accuracy: 0.8777 - val_loss: 0.7608 - val_accuracy: 0.8020
Epoch 64/200
704/704 [==============================] - 4s 6ms/step - loss: 0.5093 - accuracy: 0.8747 - val_loss: 0.7710 - val_accuracy: 0.7996
Epoch 65/200
704/704 [==============================] - 4s 6ms/step - loss: 0.5118 - accuracy: 0.8745 - val_loss: 0.7451 - val_accuracy: 0.8062
Epoch 66/200
704/704 [==============================] - 4s 6ms/step - loss: 0.5054 - accuracy: 0.8764 - val_loss: 0.7839 - val_accuracy: 0.7998
Epoch 67/200
704/704 [==============================] - 4s 6ms/step - loss: 0.5009 - accuracy: 0.8775 - val_loss: 0.7768 - val_accuracy: 0.8022
Epoch 68/200
704/704 [==============================] - 4s 6ms/step - loss: 0.5001 - accuracy: 0.8789 - val_loss: 0.7661 - val_accuracy: 0.7988
Epoch 69/200
704/704 [==============================] - 4s 6ms/step - loss: 0.4910 - accuracy: 0.8816 - val_loss: 0.7415 - val_accuracy: 0.8084
Epoch 70/200
704/704 [==============================] - 4s 6ms/step - loss: 0.4900 - accuracy: 0.8805 - val_loss: 0.7718 - val_accuracy: 0.8044
Time taken to train Model: 284.72 seconds
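Training stopped at epoch 70 of 200 because of the EarlyStopping callback: the best val_accuracy (0.8118) was reached at epoch 60, and none of the following 10 epochs beat it. The plain-Python sketch below mimics that patience rule on the validation accuracies copied from the log for epochs 60 to 70. The function name is hypothetical, and this is a simplified model of the callback (no min_delta, no baseline), not the Keras implementation.

```python
def early_stop_index(values, patience):
    """Return the index at which a 'maximize' early-stopping rule would halt,
    or None if it never triggers. Simplified: no min_delta, no baseline."""
    best = float("-inf")
    wait = 0
    for i, value in enumerate(values):
        if value > best:
            best, wait = value, 0     # improvement resets the counter
        else:
            wait += 1                 # one more epoch without improvement
            if wait >= patience:
                return i
    return None

# val_accuracy for epochs 60..70, copied from the training log above.
# Every earlier epoch in the log is below 0.8118.
first_epoch = 60
val_acc_tail = [0.8118, 0.8046, 0.8060, 0.8020, 0.7996, 0.8062,
                0.7998, 0.8022, 0.7988, 0.8084, 0.8044]

stop = early_stop_index(val_acc_tail, patience=10)
print("stops at epoch", first_epoch + stop)
```

With patience=10 the rule fires on the eleventh value, which corresponds to epoch 70 in the log.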
In [170]:
train_loss = history_10.history['loss'][-1]  # Training loss from the last epoch
train_accuracy = history_10.history['accuracy'][-1]  # Training accuracy from the last epoch
val_loss = history_10.history['val_loss'][-1]  # Validation loss from the last epoch
val_accuracy = history_10.history['val_accuracy'][-1]  # Validation accuracy from the last epoch

# Print training and validation metrics
print(f"Training Loss: {train_loss:.3f}, Training Accuracy: {train_accuracy:.3f}")
print(f"Validation Loss: {val_loss:.3f}, Validation Accuracy: {val_accuracy:.3f}")

model_10 = tf.keras.models.load_model("A2_Exp_10_3CNN_2DNN_DO_L2_BN.h5")
test_loss, test_accuracy = model_10.evaluate(x_test_norm, y_test, verbose=0)
print(f"Test Loss: {test_loss:.3f}, Test Accuracy: {test_accuracy:.3f}")

results["Experiment10"]["Test Accuracy"] = round(test_accuracy,3)
results["Experiment10"]["Test Loss"] = round(test_loss,3)
results["Experiment10"]["Train Accuracy"] = round(train_accuracy,3)
results["Experiment10"]["Train Loss"] = round(train_loss,3)
results["Experiment10"]["Validation Accuracy"] = round(val_accuracy,3)
results["Experiment10"]["Validation Loss"] = round(val_loss,3)
Training Loss: 0.490, Training Accuracy: 0.880
Validation Loss: 0.772, Validation Accuracy: 0.804
Test Loss: 0.741, Test Accuracy: 0.811
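The training and validation numbers printed above come from the last epoch, while the test numbers come from the reloaded checkpoint. ModelCheckpoint with save_best_only=True monitors val_loss by default, so the saved model is the epoch with the lowest validation loss, which need not be the last one. The sketch below shows one way to read the metrics of that epoch from a history dictionary. The helper name is hypothetical, and the small dictionary holds only the last five epochs (66 to 70), copied from the log.

```python
def best_epoch_metrics(history, monitor="val_loss"):
    """Return (index, metrics) for the entry with the lowest monitored value."""
    values = history[monitor]
    best = min(range(len(values)), key=values.__getitem__)
    return best, {name: series[best] for name, series in history.items()}

# Last five epochs (66..70) copied from the training log; in the notebook
# the full dictionary would be history_10.history.
history_tail = {
    "loss":         [0.5054, 0.5009, 0.5001, 0.4910, 0.4900],
    "accuracy":     [0.8764, 0.8775, 0.8789, 0.8816, 0.8805],
    "val_loss":     [0.7839, 0.7768, 0.7661, 0.7415, 0.7718],
    "val_accuracy": [0.7998, 0.8022, 0.7988, 0.8084, 0.8044],
}

index, metrics = best_epoch_metrics(history_tail)
print("epoch", 66 + index, metrics)
```

In this log the lowest val_loss (0.7415) occurs at epoch 69, one epoch before training stopped, so the checkpoint and the last-epoch metrics describe slightly different weights.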
In [171]:
pred10 = model_10.predict(x_test_norm)
print('shape of preds: ', pred10.shape)
313/313 [==============================] - 0s 1ms/step
shape of preds:  (10000, 10)
In [172]:
history_10_dict = history_10.history
history_10_df=pd.DataFrame(history_10_dict)
history_10_df.tail().round(3)
Out[172]:
loss accuracy val_loss val_accuracy
65 0.505 0.876 0.784 0.800
66 0.501 0.877 0.777 0.802
67 0.500 0.879 0.766 0.799
68 0.491 0.882 0.741 0.808
69 0.490 0.880 0.772 0.804

Plotting Performance Metrics

We use Matplotlib to create two plots, stacked vertically: training and validation accuracy per epoch (top), and training and validation loss per epoch (bottom).

In [173]:
plt.subplots(figsize=(16,12))
plt.tight_layout()
display_training_curves(history_10.history['accuracy'], history_10.history['val_accuracy'], 'accuracy', 211)
display_training_curves(history_10.history['loss'], history_10.history['val_loss'], 'loss', 212)
[Figure: training and validation accuracy (top) and loss (bottom) per epoch for Experiment 10]

Confusion matrices

Using sklearn.metrics, we print a classification report along with the accuracy score and root mean square error. Then we visualize the confusion matrix and see what it tells us.

In [174]:
pred10_cm=np.argmax(pred10, axis=1)
print_validation_report(y_test, pred10_cm)
Classification Report
              precision    recall  f1-score   support

           0       0.83      0.84      0.84      1000
           1       0.92      0.90      0.91      1000
           2       0.80      0.67      0.73      1000
           3       0.68      0.65      0.66      1000
           4       0.72      0.83      0.77      1000
           5       0.76      0.73      0.75      1000
           6       0.83      0.89      0.86      1000
           7       0.85      0.82      0.83      1000
           8       0.91      0.86      0.89      1000
           9       0.83      0.91      0.86      1000

    accuracy                           0.81     10000
   macro avg       0.81      0.81      0.81     10000
weighted avg       0.81      0.81      0.81     10000

Accuracy Score: 0.8111
Root Mean Square Error: 1.7584652399180372
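The per-class precision, recall and f1 values in the report all derive from the confusion matrix: precision divides the diagonal entry by its column sum, recall divides it by its row sum. The sketch below works through this on toy labels in plain Python. It is an illustration with made-up function names, not the notebook's print_validation_report helper.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix

def per_class_scores(matrix):
    """Return a list of (precision, recall, f1) tuples, one per class."""
    n = len(matrix)
    scores = []
    for c in range(n):
        tp = matrix[c][c]
        predicted = sum(matrix[r][c] for r in range(n))   # column sum
        actual = sum(matrix[c])                           # row sum
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores.append((precision, recall, f1))
    return scores

# Toy labels for three classes (not CIFAR-10 data).
y_true = [0, 0, 0, 1, 1, 2, 2, 2, 2]
y_pred = [0, 0, 1, 1, 1, 2, 2, 0, 2]

cm_toy = confusion_matrix(y_true, y_pred, 3)
scores = per_class_scores(cm_toy)
accuracy = sum(cm_toy[i][i] for i in range(3)) / len(y_true)
print(cm_toy, scores, accuracy)
```

Read the same way, the report above shows class 3 with the lowest f1-score (0.66) and class 2 with high precision (0.80) but low recall (0.67), meaning many class 2 images are assigned to other classes.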
In [175]:
plot_confusion_matrix(y_test,pred10_cm)
[Figure: confusion matrix of test-set predictions for Experiment 10]
In [176]:
cm = sns.light_palette((260, 75, 60), input="husl", as_cmap=True)
df = pd.DataFrame(pred10[0:20], columns = ['airplane'
                                          ,'automobile'
                                          ,'bird'
                                          ,'cat'
                                          ,'deer'
                                          ,'dog'
                                          ,'frog'
                                          ,'horse'
                                          ,'ship'
                                          ,'truck'])
df.style.format("{:.2%}").background_gradient(cmap=cm)
Out[176]:
  airplane automobile bird cat deer dog frog horse ship truck
0 0.02% 0.00% 0.07% 94.90% 0.03% 4.72% 0.17% 0.05% 0.03% 0.00%
1 9.96% 20.72% 0.03% 0.02% 0.03% 0.00% 0.01% 0.00% 68.96% 0.27%
2 0.97% 0.67% 0.06% 0.03% 0.05% 0.02% 0.02% 0.01% 98.08% 0.09%
3 97.95% 0.05% 0.24% 0.19% 0.06% 0.01% 0.01% 0.00% 1.32% 0.16%
4 0.00% 0.00% 0.02% 0.04% 0.24% 0.01% 99.68% 0.00% 0.00% 0.00%
5 0.00% 0.00% 0.01% 0.38% 0.05% 0.04% 99.50% 0.00% 0.01% 0.00%
6 0.01% 99.62% 0.01% 0.03% 0.00% 0.01% 0.00% 0.01% 0.00% 0.31%
7 0.06% 0.01% 0.21% 0.56% 1.07% 0.14% 97.89% 0.01% 0.03% 0.02%
8 0.01% 0.00% 0.07% 98.07% 0.53% 1.05% 0.13% 0.13% 0.01% 0.01%
9 0.06% 52.34% 0.02% 0.04% 0.02% 0.01% 0.05% 0.00% 0.36% 47.12%
10 27.80% 0.01% 1.40% 18.56% 9.80% 21.10% 0.08% 20.49% 0.20% 0.56%
11 0.00% 0.01% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 99.99%
12 0.01% 0.02% 0.33% 7.20% 11.28% 80.29% 0.54% 0.22% 0.02% 0.08%
13 0.00% 0.00% 0.00% 0.01% 0.00% 0.02% 0.00% 99.97% 0.00% 0.00%
14 0.00% 0.01% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 99.99%
15 3.50% 0.26% 0.06% 0.76% 2.27% 0.03% 8.43% 0.02% 84.60% 0.07%
16 0.00% 0.00% 0.06% 1.52% 0.01% 98.34% 0.03% 0.02% 0.00% 0.00%
17 0.01% 0.05% 0.37% 9.61% 1.25% 34.08% 0.14% 54.07% 0.02% 0.40%
18 1.81% 0.89% 0.01% 0.02% 0.02% 0.00% 0.01% 0.00% 96.05% 1.19%
19 0.00% 0.00% 0.01% 0.01% 0.00% 0.00% 99.98% 0.00% 0.00% 0.00%
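Each row of the table is a softmax distribution over the ten classes. The predicted label is the column with the largest value, and the gap between the two largest values gives a rough sense of how decisive the prediction is. The sketch below applies this to rows 9 and 10, copied from the table. The helper and variable names are illustrative.

```python
cifar10_labels = ['airplane', 'automobile', 'bird', 'cat', 'deer',
                  'dog', 'frog', 'horse', 'ship', 'truck']

def top_two(probabilities, names):
    """Return the two most probable (name, probability) pairs, best first."""
    ranked = sorted(zip(names, probabilities), key=lambda pair: pair[1], reverse=True)
    return ranked[0], ranked[1]

# Rows 9 and 10 of the table above, converted from percentages to fractions.
row_9 = [0.0006, 0.5234, 0.0002, 0.0004, 0.0002, 0.0001, 0.0005, 0.0000, 0.0036, 0.4712]
row_10 = [0.2780, 0.0001, 0.0140, 0.1856, 0.0980, 0.2110, 0.0008, 0.2049, 0.0020, 0.0056]

for label, row in [("row 9", row_9), ("row 10", row_10)]:
    (first, p1), (second, p2) = top_two(row, cifar10_labels)
    print(f"{label}: {first} {p1:.2%} vs {second} {p2:.2%}, margin {p1 - p2:.2%}")
```

Row 9 is a near tie between automobile and truck, and row 10 spreads its probability over several classes, so both are low-confidence predictions compared with rows such as 11 or 19.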
In [177]:
layer_names = []
for layer in model_10.layers:
    layer_names.append(layer.name)

layer_names
Out[177]:
['conv2d',
 'max_pooling2d',
 'dropout',
 'conv2d_1',
 'max_pooling2d_1',
 'dropout_1',
 'conv2d_2',
 'max_pooling2d_2',
 'dropout_2',
 'flatten',
 'dense',
 'batch_normalization',
 'dropout_3',
 'dense_1',
 'batch_normalization_1',
 'dropout_4',
 'dense_2']
In [178]:
# Extract the outputs of the first 14 layers (everything up to and including 'dense_1'):
layer_outputs_10 = [layer.output for layer in model_10.layers[:14]]
# Create a model that returns these outputs, given the model input:
activation_model_10 = tf.keras.models.Model(inputs=model_10.input, outputs=layer_outputs_10)

# Get activation values for the two hidden dense layers on the first 1200 validation images
# activations_10 = activation_model_10.predict(x_valid_norm[:3250])
activations_10 = activation_model_10.predict(x_valid_norm[:1200])
# Index -4 of the 14 outputs is layer 10, 'dense' (384 units)
dense_layer_activations_10 = activations_10[-4]
# Index -1 is layer 13, 'dense_1' (768 units), not the softmax output layer 'dense_2'
output_layer_activations_10 = activations_10[-1]
38/38 [==============================] - 0s 1ms/step
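Because the activation model returns one array per selected layer, negative indices into that list are easy to misread. The short check below uses the layer names listed in Out[177] to show which layers sit behind the indices -4 and -1 of a 14-layer slice. It is a plain-Python illustration and does not touch the model itself.

```python
# Layer names as listed in Out[177].
layer_names = ['conv2d', 'max_pooling2d', 'dropout',
               'conv2d_1', 'max_pooling2d_1', 'dropout_1',
               'conv2d_2', 'max_pooling2d_2', 'dropout_2',
               'flatten', 'dense', 'batch_normalization', 'dropout_3',
               'dense_1', 'batch_normalization_1', 'dropout_4', 'dense_2']

selected = layer_names[:14]          # same slice as model_10.layers[:14]
print(len(selected))                 # number of outputs returned by the activation model
print(selected[-4], selected[-1])    # layers behind activations_10[-4] and activations_10[-1]
```

So the array used for t-SNE holds the 384-unit activations of 'dense', and the last array in the list holds the 768-unit activations of 'dense_1' rather than the softmax output.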

sklearn.manifold.TSNE

https://scikit-learn.org/stable/modules/generated/sklearn.manifold.TSNE.html

In [179]:
# Reduce the dimension using t-SNE to visualize in a scatterplot
tsne_10 = TSNE(n_components=2, verbose=1, init='pca', learning_rate='auto', perplexity=40, n_iter=300)
tsne_results_10 = tsne_10.fit_transform(dense_layer_activations_10)

# Scaling
tsne_results_10 = (tsne_results_10 - tsne_results_10.min()) / (tsne_results_10.max() - tsne_results_10.min())
[t-SNE] Computing 121 nearest neighbors...
[t-SNE] Indexed 1200 samples in 0.000s...
[t-SNE] Computed neighbors for 1200 samples in 0.021s...
[t-SNE] Computed conditional probabilities for sample 1000 / 1200
[t-SNE] Computed conditional probabilities for sample 1200 / 1200
[t-SNE] Mean sigma: 6.276072
[t-SNE] KL divergence after 250 iterations with early exaggeration: 61.099758
[t-SNE] KL divergence after 300 iterations: 0.962577
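The scaling step maps the 2-D embedding into the unit square using a single global minimum and maximum taken over both coordinates, so the two axes are shrunk by the same factor and relative distances are preserved. A minimal plain-Python sketch of the same operation on toy points (not real t-SNE output, and with a made-up function name):

```python
def global_min_max_scale(points):
    """Scale 2-D points into [0, 1] using one min and one max over all coordinates."""
    flat = [value for point in points for value in point]
    low, high = min(flat), max(flat)
    span = high - low
    return [[(value - low) / span for value in point] for point in points]

# Toy embedding (not real t-SNE output).
toy_points = [[-4.0, 2.0], [0.0, 0.0], [6.0, -1.0]]
scaled = global_min_max_scale(toy_points)
print(scaled)
```

One consequence of using a global range is that the axis with the smaller spread does not fill the whole [0, 1] interval. Here the second coordinate only covers 0.3 to 0.6.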
In [180]:
cmap = plt.cm.tab10
plt.figure(figsize=(16,10))
# scatter = plt.scatter(tsne_results_10[:,0],tsne_results_10[:,1], c=y_valid_split[:3250], s=10, cmap=cmap)
scatter = plt.scatter(tsne_results_10[:,0],tsne_results_10[:,1], c=y_valid_split[:1200], s=10, cmap=cmap)
plt.legend(handles=scatter.legend_elements()[0], labels=class_names)

image_positions = np.array([[1., 1.]])
for index, position in enumerate(tsne_results_10):
    dist = np.sum((position - image_positions) ** 2, axis=1)
    if np.min(dist) > 0.02: # if far enough from other images
        image_positions = np.r_[image_positions, [position]]
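        # Thumbnails come from the same validation images that were embedded
        # (assumes x_valid_norm holds RGB values scaled to [0, 1])
        imagebox = mpl.offsetbox.AnnotationBbox(
            mpl.offsetbox.OffsetImage(x_valid_norm[index], cmap="binary"),
            position, bboxprops={"lw": 1})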
        plt.gca().add_artist(imagebox)
plt.axis("off")
plt.show()
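The loop above places a thumbnail only when the point is far enough from every thumbnail already placed, which keeps the plot readable. The sketch below isolates that greedy rule in plain Python on toy coordinates. The function name and the toy points are illustrative. Note that the threshold is compared against the squared distance, so 0.02 corresponds to a Euclidean distance of about 0.14 in the scaled plot.

```python
def pick_spread_out(points, min_sq_dist=0.02, start=(1.0, 1.0)):
    """Greedily keep indices of points whose squared distance to every
    previously kept position (and to the start anchor) exceeds min_sq_dist."""
    kept_positions = [start]
    kept_indices = []
    for index, (x, y) in enumerate(points):
        nearest = min((x - px) ** 2 + (y - py) ** 2 for px, py in kept_positions)
        if nearest > min_sq_dist:
            kept_positions.append((x, y))
            kept_indices.append(index)
    return kept_indices

# Toy points in the unit square (not real t-SNE output).
toy = [(0.10, 0.10), (0.15, 0.12), (0.50, 0.50), (0.52, 0.49), (0.95, 0.95), (0.10, 0.60)]
print(pick_spread_out(toy))
```

Points 1 and 3 are skipped because they sit next to points already kept, and point 4 is skipped because it is close to the (1, 1) anchor that seeds the list.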
[Figure: t-SNE scatter plot of dense-layer activations for 1200 validation images, colored by class, with image thumbnails]

Experiment 11 - TWEAK Hyperparameters

  • CNN with 3 layers/max pooling layers
  • 2 Fully-Connected Hidden Layers (384, 768)
  • Dropout(variable)
  • L2 Regularization(variable)
  • Batch Normalization
In [181]:
l2_rate = 0.001
dropout_rate = 0.5
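The two values set here control different regularizers. An L2 kernel regularizer adds l2_rate times the sum of squared weights of that layer to the loss, and dropout zeroes a fraction dropout_rate of activations during training while scaling the survivors by 1 / (1 - dropout_rate). The sketch below illustrates both in plain Python with toy numbers. The function names are hypothetical and this is not the Keras code path.

```python
import random

def l2_penalty(weights, l2_rate):
    """Penalty added to the loss: l2_rate * sum of squared weights."""
    return l2_rate * sum(w * w for w in weights)

def inverted_dropout(values, rate, rng):
    """Zero each value with probability `rate`; scale survivors by 1 / (1 - rate)."""
    keep_scale = 1.0 / (1.0 - rate)
    return [0.0 if rng.random() < rate else v * keep_scale for v in values]

toy_weights = [0.5, -1.0, 2.0]                 # illustrative values only
print(l2_penalty(toy_weights, l2_rate=0.001))  # 0.001 * (0.25 + 1.0 + 4.0)

rng = random.Random(0)
dropped = inverted_dropout([1.0] * 10, rate=0.5, rng=rng)
print(dropped)   # surviving entries are 2.0, dropped ones are 0.0
```

With dropout_rate = 0.5 each surviving activation is doubled. With the 0.3 used in Experiment 10 the factor is 1 / 0.7, roughly 1.43.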
Build CNN Model
In [182]:
k.clear_session()
model_11 = Sequential([
  Conv2D(filters=128, kernel_size=(3, 3), strides=(1, 1), activation=tf.nn.relu,input_shape=x_train_norm.shape[1:]),
  MaxPool2D((2, 2),strides=2),
  Dropout(dropout_rate),
  Conv2D(filters=256, kernel_size=(3, 3), strides=(1, 1), activation=tf.nn.relu),
  MaxPool2D((2, 2),strides=2),
  Dropout(dropout_rate),
  Conv2D(filters=512, kernel_size=(3, 3), strides=(1, 1), activation=tf.nn.relu),
  MaxPool2D((2, 2),strides=2),
  Dropout(dropout_rate),
  Flatten(),
  Dense(units=384,activation=tf.nn.relu,kernel_regularizer=tf.keras.regularizers.L2(l2_rate)),
  BatchNormalization(),
  Dropout(dropout_rate),
  Dense(units=768,activation=tf.nn.relu,kernel_regularizer=tf.keras.regularizers.L2(l2_rate)),
  BatchNormalization(),
  Dropout(dropout_rate),
  Dense(units=10, activation=tf.nn.softmax)
])
results["Experiment11"] = {}
results["Experiment11"]["Architecture"] = f"• CNN with 3 layers/max pooling layers\n • 2 Fully-Connected Hidden Layers\n • Dropout({dropout_rate})\n • L2 Regularization({l2_rate})\n • Batch Normalization"
In [183]:
model_11.summary()
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d (Conv2D)             (None, 30, 30, 128)       3584      
                                                                 
 max_pooling2d (MaxPooling2  (None, 15, 15, 128)       0         
 D)                                                              
                                                                 
 dropout (Dropout)           (None, 15, 15, 128)       0         
                                                                 
 conv2d_1 (Conv2D)           (None, 13, 13, 256)       295168    
                                                                 
 max_pooling2d_1 (MaxPoolin  (None, 6, 6, 256)         0         
 g2D)                                                            
                                                                 
 dropout_1 (Dropout)         (None, 6, 6, 256)         0         
                                                                 
 conv2d_2 (Conv2D)           (None, 4, 4, 512)         1180160   
                                                                 
 max_pooling2d_2 (MaxPoolin  (None, 2, 2, 512)         0         
 g2D)                                                            
                                                                 
 dropout_2 (Dropout)         (None, 2, 2, 512)         0         
                                                                 
 flatten (Flatten)           (None, 2048)              0         
                                                                 
 dense (Dense)               (None, 384)               786816    
                                                                 
 batch_normalization (Batch  (None, 384)               1536      
 Normalization)                                                  
                                                                 
 dropout_3 (Dropout)         (None, 384)               0         
                                                                 
 dense_1 (Dense)             (None, 768)               295680    
                                                                 
 batch_normalization_1 (Bat  (None, 768)               3072      
 chNormalization)                                                
                                                                 
 dropout_4 (Dropout)         (None, 768)               0         
                                                                 
 dense_2 (Dense)             (None, 10)                7690      
                                                                 
=================================================================
Total params: 2573706 (9.82 MB)
Trainable params: 2571402 (9.81 MB)
Non-trainable params: 2304 (9.00 KB)
_________________________________________________________________
In [184]:
keras.utils.plot_model(model_11, "CIFAR10_EXP_11_TWEAK.png", show_shapes=True)
You must install pydot (`pip install pydot`) and install graphviz (see instructions at https://graphviz.gitlab.io/download/) for plot_model to work.
In [185]:
model_11.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
              metrics=['accuracy'])

Model Train

In [187]:
# Start time
start_time = time.time()

history_11 = model_11.fit(x_train_norm
                    ,y_train_split
                    ,epochs=200
                    ,batch_size=64
                    ,verbose=1
                    ,validation_data=(x_valid_norm, y_valid_split)
                    ,callbacks=[
                     tf.keras.callbacks.ModelCheckpoint("A2_Exp_11_3CNN_2DNN_BN_TWEAK_L2001_DO05.h5",save_best_only=True,save_weights_only=False)
                     ,tf.keras.callbacks.EarlyStopping(monitor='val_accuracy', patience=10),
                    ]
                   )

# End time
end_time = time.time()

# Calculate and print the time taken
elapsed_time = end_time - start_time
print(f"Time taken to train Model: {elapsed_time:.2f} seconds")
results["Experiment11"]["Train Time"] = str(round(elapsed_time, 2)) + ' seconds'
Epoch 1/200
2024-10-20 04:05:04.676861: E tensorflow/core/grappler/optimizers/meta_optimizer.cc:961] layout failed: INVALID_ARGUMENT: Size of values 0 does not match size of permutation 4 @ fanin shape insequential/dropout/dropout/SelectV2-2-TransposeNHWCToNCHW-LayoutOptimizer
704/704 [==============================] - 6s 6ms/step - loss: 3.2646 - accuracy: 0.2181 - val_loss: 2.6560 - val_accuracy: 0.2834
Epoch 2/200
704/704 [==============================] - 4s 6ms/step - loss: 2.2898 - accuracy: 0.3897 - val_loss: 2.2724 - val_accuracy: 0.3714
Epoch 3/200
704/704 [==============================] - 4s 6ms/step - loss: 1.9079 - accuracy: 0.4693 - val_loss: 1.8766 - val_accuracy: 0.4700
Epoch 4/200
704/704 [==============================] - 4s 6ms/step - loss: 1.7136 - accuracy: 0.5144 - val_loss: 1.7722 - val_accuracy: 0.4950
Epoch 5/200
704/704 [==============================] - 4s 6ms/step - loss: 1.6062 - accuracy: 0.5435 - val_loss: 1.7477 - val_accuracy: 0.5028
Epoch 6/200
704/704 [==============================] - 4s 6ms/step - loss: 1.5422 - accuracy: 0.5663 - val_loss: 1.5872 - val_accuracy: 0.5442
Epoch 7/200
704/704 [==============================] - 4s 6ms/step - loss: 1.4910 - accuracy: 0.5862 - val_loss: 1.6825 - val_accuracy: 0.5234
Epoch 8/200
704/704 [==============================] - 4s 6ms/step - loss: 1.4598 - accuracy: 0.6013 - val_loss: 1.4008 - val_accuracy: 0.6234
Epoch 9/200
704/704 [==============================] - 4s 6ms/step - loss: 1.4173 - accuracy: 0.6152 - val_loss: 1.3281 - val_accuracy: 0.6388
Epoch 10/200
704/704 [==============================] - 4s 6ms/step - loss: 1.3988 - accuracy: 0.6280 - val_loss: 1.3159 - val_accuracy: 0.6340
Epoch 11/200
704/704 [==============================] - 4s 6ms/step - loss: 1.3685 - accuracy: 0.6347 - val_loss: 1.2403 - val_accuracy: 0.6754
Epoch 12/200
704/704 [==============================] - 4s 6ms/step - loss: 1.3441 - accuracy: 0.6427 - val_loss: 1.3900 - val_accuracy: 0.6382
Epoch 13/200
704/704 [==============================] - 4s 6ms/step - loss: 1.3207 - accuracy: 0.6522 - val_loss: 1.2797 - val_accuracy: 0.6646
Epoch 14/200
704/704 [==============================] - 4s 6ms/step - loss: 1.3113 - accuracy: 0.6552 - val_loss: 1.1753 - val_accuracy: 0.7018
Epoch 15/200
704/704 [==============================] - 4s 6ms/step - loss: 1.2907 - accuracy: 0.6640 - val_loss: 1.2927 - val_accuracy: 0.6528
Epoch 16/200
704/704 [==============================] - 4s 6ms/step - loss: 1.2777 - accuracy: 0.6681 - val_loss: 1.2825 - val_accuracy: 0.6596
Epoch 17/200
704/704 [==============================] - 4s 6ms/step - loss: 1.2562 - accuracy: 0.6750 - val_loss: 1.1284 - val_accuracy: 0.7154
Epoch 18/200
704/704 [==============================] - 4s 6ms/step - loss: 1.2396 - accuracy: 0.6776 - val_loss: 1.0919 - val_accuracy: 0.7296
Epoch 19/200
704/704 [==============================] - 4s 6ms/step - loss: 1.2302 - accuracy: 0.6842 - val_loss: 1.2078 - val_accuracy: 0.6842
Epoch 20/200
704/704 [==============================] - 4s 6ms/step - loss: 1.2179 - accuracy: 0.6877 - val_loss: 1.0980 - val_accuracy: 0.7294
Epoch 21/200
704/704 [==============================] - 4s 6ms/step - loss: 1.2085 - accuracy: 0.6880 - val_loss: 1.0410 - val_accuracy: 0.7468
Epoch 22/200
704/704 [==============================] - 4s 6ms/step - loss: 1.1906 - accuracy: 0.6977 - val_loss: 1.0670 - val_accuracy: 0.7310
Epoch 23/200
704/704 [==============================] - 4s 6ms/step - loss: 1.1903 - accuracy: 0.6974 - val_loss: 1.0773 - val_accuracy: 0.7350
Epoch 24/200
704/704 [==============================] - 4s 6ms/step - loss: 1.1740 - accuracy: 0.7000 - val_loss: 1.0295 - val_accuracy: 0.7496
Epoch 25/200
704/704 [==============================] - 4s 6ms/step - loss: 1.1714 - accuracy: 0.7047 - val_loss: 1.0207 - val_accuracy: 0.7538
Epoch 26/200
704/704 [==============================] - 4s 6ms/step - loss: 1.1622 - accuracy: 0.7055 - val_loss: 1.1109 - val_accuracy: 0.7238
Epoch 27/200
704/704 [==============================] - 4s 6ms/step - loss: 1.1484 - accuracy: 0.7102 - val_loss: 1.1120 - val_accuracy: 0.7192
Epoch 28/200
704/704 [==============================] - 4s 6ms/step - loss: 1.1425 - accuracy: 0.7108 - val_loss: 1.0151 - val_accuracy: 0.7596
Epoch 29/200
704/704 [==============================] - 4s 6ms/step - loss: 1.1246 - accuracy: 0.7175 - val_loss: 1.0509 - val_accuracy: 0.7462
Epoch 30/200
704/704 [==============================] - 4s 6ms/step - loss: 1.1254 - accuracy: 0.7176 - val_loss: 0.9764 - val_accuracy: 0.7674
Epoch 31/200
704/704 [==============================] - 4s 6ms/step - loss: 1.1289 - accuracy: 0.7170 - val_loss: 1.1292 - val_accuracy: 0.7086
Epoch 32/200
704/704 [==============================] - 4s 6ms/step - loss: 1.1192 - accuracy: 0.7207 - val_loss: 1.0041 - val_accuracy: 0.7562
Epoch 33/200
704/704 [==============================] - 4s 6ms/step - loss: 1.1044 - accuracy: 0.7224 - val_loss: 0.9532 - val_accuracy: 0.7770
Epoch 34/200
704/704 [==============================] - 4s 6ms/step - loss: 1.1051 - accuracy: 0.7228 - val_loss: 0.9875 - val_accuracy: 0.7618
Epoch 35/200
704/704 [==============================] - 4s 6ms/step - loss: 1.0855 - accuracy: 0.7286 - val_loss: 0.9614 - val_accuracy: 0.7684
Epoch 36/200
704/704 [==============================] - 4s 6ms/step - loss: 1.0871 - accuracy: 0.7292 - val_loss: 1.0699 - val_accuracy: 0.7352
Epoch 37/200
704/704 [==============================] - 4s 6ms/step - loss: 1.0866 - accuracy: 0.7287 - val_loss: 1.0312 - val_accuracy: 0.7486
Epoch 38/200
704/704 [==============================] - 4s 6ms/step - loss: 1.0655 - accuracy: 0.7365 - val_loss: 0.9811 - val_accuracy: 0.7584
Epoch 39/200
704/704 [==============================] - 4s 6ms/step - loss: 1.0672 - accuracy: 0.7348 - val_loss: 1.0472 - val_accuracy: 0.7358
Epoch 40/200
704/704 [==============================] - 4s 6ms/step - loss: 1.0645 - accuracy: 0.7356 - val_loss: 0.9501 - val_accuracy: 0.7714
Epoch 41/200
704/704 [==============================] - 4s 6ms/step - loss: 1.0560 - accuracy: 0.7378 - val_loss: 0.9597 - val_accuracy: 0.7682
Epoch 42/200
704/704 [==============================] - 4s 6ms/step - loss: 1.0481 - accuracy: 0.7387 - val_loss: 0.9514 - val_accuracy: 0.7746
Epoch 43/200
704/704 [==============================] - 4s 6ms/step - loss: 1.0461 - accuracy: 0.7412 - val_loss: 0.9558 - val_accuracy: 0.7712
Time taken to train Model: 173.08 seconds
In [188]:
train_loss = history_11.history['loss'][-1]  # Training loss from the last epoch
train_accuracy = history_11.history['accuracy'][-1]  # Training accuracy from the last epoch
val_loss = history_11.history['val_loss'][-1]  # Validation loss from the last epoch
val_accuracy = history_11.history['val_accuracy'][-1]  # Validation accuracy from the last epoch

# Print training and validation metrics
print(f"Training Loss: {train_loss:.3f}, Training Accuracy: {train_accuracy:.3f}")
print(f"Validation Loss: {val_loss:.3f}, Validation Accuracy: {val_accuracy:.3f}")

model_11 = tf.keras.models.load_model("A2_Exp_11_3CNN_2DNN_BN_TWEAK_L2001_DO05.h5")
test_loss, test_accuracy = model_11.evaluate(x_test_norm, y_test, verbose=0)
print(f"Test Loss: {test_loss:.3f}, Test Accuracy: {test_accuracy:.3f}")

results["Experiment11"]["Test Accuracy"] = round(test_accuracy,3)
results["Experiment11"]["Test Loss"] = round(test_loss,3)
results["Experiment11"]["Train Accuracy"] = round(train_accuracy,3)
results["Experiment11"]["Train Loss"] = round(train_loss,3)
results["Experiment11"]["Validation Accuracy"] = round(val_accuracy,3)
results["Experiment11"]["Validation Loss"] = round(val_loss,3)
Training Loss: 1.046, Training Accuracy: 0.741
Validation Loss: 0.956, Validation Accuracy: 0.771
Test Loss: 0.944, Test Accuracy: 0.772
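The training and validation figures above are taken from the final epoch, while the test figures come from the model reloaded from the .h5 file. If that file was written by a best-only checkpoint (an assumption; the callback setup is not shown in this section), the epoch with the lowest validation loss is the more comparable row. A minimal sketch, using a hypothetical helper `best_epoch_metrics` and the last four epochs copied from the log above:

```python
def best_epoch_metrics(history, monitor="val_loss"):
    """Return (epoch_number, metrics) for the epoch with the lowest `monitor` value.

    `history` is a dict of per-epoch lists, shaped like `history_11.history`.
    """
    values = history[monitor]
    best_idx = min(range(len(values)), key=values.__getitem__)
    metrics = {name: series[best_idx] for name, series in history.items()}
    return best_idx + 1, metrics  # Keras logs epochs starting at 1


# Epochs 40-43 from the training log above.
sample_history = {
    "loss": [1.0645, 1.0560, 1.0481, 1.0461],
    "accuracy": [0.7356, 0.7378, 0.7387, 0.7412],
    "val_loss": [0.9501, 0.9597, 0.9514, 0.9558],
    "val_accuracy": [0.7714, 0.7682, 0.7746, 0.7712],
}
epoch_offset = 39  # the sample starts at epoch 40
epoch, metrics = best_epoch_metrics(sample_history)
print(epoch + epoch_offset, metrics["val_loss"], metrics["val_accuracy"])
```

With the full history, `best_epoch_metrics(history_11.history)` would return the row to report alongside the checkpointed model's test metrics.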
In [189]:
pred11 = model_11.predict(x_test_norm)
print('shape of preds: ', pred11.shape)
313/313 [==============================] - 0s 1ms/step
shape of preds:  (10000, 10)
In [190]:
history_11_dict = history_11.history
history_11_df=pd.DataFrame(history_11_dict)
history_11_df.tail().round(3)
Out[190]:
loss accuracy val_loss val_accuracy
38 1.067 0.735 1.047 0.736
39 1.064 0.736 0.950 0.771
40 1.056 0.738 0.960 0.768
41 1.048 0.739 0.951 0.775
42 1.046 0.741 0.956 0.771

Plotting Performance Metrics¶

We use Matplotlib to create two plots, stacked one above the other: training and validation accuracy per epoch (top), and training and validation loss per epoch (bottom).

In [191]:
plt.subplots(figsize=(16,12)) # l2_rate = 0.001, dropout_rate = 0.5
plt.tight_layout()
display_training_curves(history_11.history['accuracy'], history_11.history['val_accuracy'], 'accuracy', 211)
display_training_curves(history_11.history['loss'], history_11.history['val_loss'], 'loss', 212)
[Figure: training and validation accuracy and loss curves for Experiment 11]

Confusion matrices¶

Using sklearn.metrics, we print a classification report. Then we visualize the confusion matrix and see what it tells us.

In [192]:
pred11_cm=np.argmax(pred11, axis=1)
print_validation_report(y_test, pred11_cm)
Classification Report
              precision    recall  f1-score   support

           0       0.85      0.75      0.80      1000
           1       0.91      0.88      0.90      1000
           2       0.75      0.60      0.67      1000
           3       0.70      0.52      0.60      1000
           4       0.63      0.82      0.72      1000
           5       0.76      0.62      0.68      1000
           6       0.68      0.91      0.78      1000
           7       0.87      0.80      0.83      1000
           8       0.77      0.94      0.84      1000
           9       0.87      0.87      0.87      1000

    accuracy                           0.77     10000
   macro avg       0.78      0.77      0.77     10000
weighted avg       0.78      0.77      0.77     10000

Accuracy Score: 0.7717
Root Mean Square Error: 1.9351485731075018
In [193]:
plot_confusion_matrix(y_test,pred11_cm)
[Figure: confusion matrix for Experiment 11 on the test set]
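To read a confusion matrix quickly, it helps to list its largest off-diagonal cells, i.e. the class pairs that are confused most often. The sketch below is one way to do that with NumPy alone; `confusion_counts` and `top_confusions` are hypothetical helper names, and the toy labels are for illustration only (they are not taken from the experiment). `sklearn.metrics.confusion_matrix` produces the same row/column layout.

```python
import numpy as np


def confusion_counts(y_true, y_pred, n_classes):
    """Build an (n_classes, n_classes) matrix; rows are true labels, columns are predictions."""
    y_true = np.asarray(y_true).ravel()
    y_pred = np.asarray(y_pred).ravel()
    matrix = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(matrix, (y_true, y_pred), 1)  # count each (true, predicted) pair
    return matrix


def top_confusions(matrix, class_names, k=3):
    """Return the k largest off-diagonal cells as (true_name, predicted_name, count)."""
    off_diag = matrix.copy()
    np.fill_diagonal(off_diag, 0)  # ignore correct predictions
    order = np.argsort(off_diag, axis=None)[::-1][:k]
    rows, cols = np.unravel_index(order, off_diag.shape)
    return [(class_names[r], class_names[c], int(off_diag[r, c])) for r, c in zip(rows, cols)]


# Toy labels for illustration only.
toy_names = ["cat", "dog", "frog"]
toy_true = [0, 0, 0, 1, 1, 2, 2, 2]
toy_pred = [0, 1, 1, 1, 2, 2, 2, 2]
toy_matrix = confusion_counts(toy_true, toy_pred, n_classes=3)
print(top_confusions(toy_matrix, toy_names, k=2))
```

Applied to this experiment, the call would be `top_confusions(confusion_counts(y_test, pred11_cm, 10), class_names)`, assuming `class_names` lists the ten CIFAR-10 classes in label order.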
In [194]:
cm = sns.light_palette((260, 75, 60), input="husl", as_cmap=True)
df = pd.DataFrame(pred11[0:20], columns = ['airplane'
                                          ,'automobile'
                                          ,'bird'
                                          ,'cat'
                                          ,'deer'
                                          ,'dog'
                                          ,'frog'
                                          ,'horse'
                                          ,'ship'
                                          ,'truck'])
df.style.format("{:.2%}").background_gradient(cmap=cm)
Out[194]:
  airplane automobile bird cat deer dog frog horse ship truck
0 0.39% 0.03% 1.15% 88.12% 2.07% 3.40% 2.86% 0.47% 1.44% 0.06%
1 0.49% 0.47% 0.02% 0.05% 0.01% 0.00% 0.02% 0.00% 98.89% 0.04%
2 4.66% 2.81% 0.18% 0.44% 0.17% 0.07% 0.12% 0.05% 85.17% 6.32%
3 6.60% 1.05% 0.23% 0.07% 0.04% 0.01% 0.09% 0.01% 91.60% 0.31%
4 0.04% 0.04% 3.17% 0.76% 10.38% 0.12% 85.37% 0.04% 0.02% 0.04%
5 0.02% 0.02% 0.33% 1.07% 0.98% 0.28% 97.24% 0.02% 0.02% 0.03%
6 0.14% 91.04% 0.16% 0.26% 0.03% 0.20% 0.11% 0.03% 0.03% 7.98%
7 0.38% 0.06% 7.77% 3.22% 14.34% 0.80% 72.92% 0.22% 0.16% 0.13%
8 0.17% 0.04% 2.19% 75.90% 6.23% 9.32% 4.51% 1.46% 0.07% 0.11%
9 1.08% 60.89% 0.53% 0.54% 0.39% 0.18% 2.92% 0.07% 1.50% 31.89%
10 53.60% 0.16% 3.00% 3.23% 6.06% 1.61% 0.32% 2.46% 28.16% 1.39%
11 0.01% 0.13% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.01% 99.84%
12 0.34% 0.35% 6.99% 15.79% 43.54% 13.34% 15.78% 2.25% 1.33% 0.30%
13 0.03% 0.01% 0.13% 0.08% 1.01% 0.54% 0.05% 98.14% 0.01% 0.02%
14 0.05% 0.13% 0.02% 0.02% 0.00% 0.01% 0.00% 0.01% 0.04% 99.71%
15 3.83% 0.22% 0.97% 0.46% 0.49% 0.03% 3.25% 0.03% 90.51% 0.21%
16 0.01% 0.02% 0.32% 7.51% 0.35% 90.63% 0.66% 0.35% 0.07% 0.07%
17 1.43% 0.46% 6.55% 14.10% 12.16% 9.01% 29.78% 22.41% 1.17% 2.91%
18 2.62% 0.35% 0.04% 0.05% 0.02% 0.01% 0.02% 0.01% 95.49% 1.39%
19 0.02% 0.02% 0.62% 0.31% 0.55% 0.04% 98.40% 0.01% 0.01% 0.02%
In [195]:
layer_names = []
for layer in model_11.layers:
    layer_names.append(layer.name)

layer_names
Out[195]:
['conv2d',
 'max_pooling2d',
 'dropout',
 'conv2d_1',
 'max_pooling2d_1',
 'dropout_1',
 'conv2d_2',
 'max_pooling2d_2',
 'dropout_2',
 'flatten',
 'dense',
 'batch_normalization',
 'dropout_3',
 'dense_1',
 'batch_normalization_1',
 'dropout_4',
 'dense_2']
In [196]:
# Extracts the outputs of the first 14 layers (conv2d through dense_1):
layer_outputs_11 = [layer.output for layer in model_11.layers[:14]]
# Creates a model that will return these outputs, given the model input:
activation_model_11 = tf.keras.models.Model(inputs=model_11.input, outputs=layer_outputs_11)

# Get activation values for the first 1200 validation images
# activations_11 = activation_model_11.predict(x_valid_norm[:3250])
activations_11 = activation_model_11.predict(x_valid_norm[:1200])
# Index -4 of the 14 outputs is 'dense', the first fully-connected hidden layer
dense_layer_activations_11 = activations_11[-4]
# Index -1 is 'dense_1', the second hidden layer (not the final 'dense_2' layer)
output_layer_activations_11 = activations_11[-1]
38/38 [==============================] - 0s 1ms/step
In [197]:
activations_11[-3].shape
Out[197]:
(1200, 384)
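Negative indices into a sliced layer list are easy to misread, so it is worth checking which layer each one selects. The sketch below does this with the layer names printed in Out[195]; `describe_index` is a hypothetical helper, not part of Keras.

```python
def describe_index(layer_names, stop, index):
    """Name the layer that `index` selects within the first `stop` layers."""
    return layer_names[:stop][index]


# Layer names copied from Out[195].
layer_names = [
    "conv2d", "max_pooling2d", "dropout",
    "conv2d_1", "max_pooling2d_1", "dropout_1",
    "conv2d_2", "max_pooling2d_2", "dropout_2",
    "flatten",
    "dense", "batch_normalization", "dropout_3",
    "dense_1", "batch_normalization_1", "dropout_4",
    "dense_2",
]

for idx in (-4, -3, -1):
    print(idx, describe_index(layer_names, 14, idx))
# -4 dense
# -3 batch_normalization
# -1 dense_1
```

Selecting by name, for example `model_11.get_layer("dense").output`, avoids the index arithmetic altogether.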

sklearn.manifold.TSNE¶

https://scikit-learn.org/stable/modules/generated/sklearn.manifold.TSNE.html

In [198]:
# Reduce the dimension using t-SNE to visualize in a scatterplot
tsne_11 = TSNE(n_components=2, verbose=1, init='pca', learning_rate='auto', perplexity=40, n_iter=300)
tsne_results_11 = tsne_11.fit_transform(dense_layer_activations_11)

# Scaling
tsne_results_11 = (tsne_results_11 - tsne_results_11.min()) / (tsne_results_11.max() - tsne_results_11.min())
[t-SNE] Computing 121 nearest neighbors...
[t-SNE] Indexed 1200 samples in 0.001s...
[t-SNE] Computed neighbors for 1200 samples in 0.021s...
[t-SNE] Computed conditional probabilities for sample 1000 / 1200
[t-SNE] Computed conditional probabilities for sample 1200 / 1200
[t-SNE] Mean sigma: 7.921385
[t-SNE] KL divergence after 250 iterations with early exaggeration: 61.436398
[t-SNE] KL divergence after 300 iterations: 1.002880
In [201]:
cmap = plt.cm.tab10
plt.figure(figsize=(16,10))
# scatter = plt.scatter(tsne_results_11[:,0],tsne_results_11[:,1], c=y_valid[:3250], s=10, cmap=cmap)
scatter = plt.scatter(tsne_results_11[:,0],tsne_results_11[:,1], c=y_valid_split[:1200], s=10, cmap=cmap)
plt.legend(handles=scatter.legend_elements()[0], labels=class_names)

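# Overlay a thumbnail on the scatter plot wherever there is room for one
image_positions = np.array([[1., 1.]])
for index, position in enumerate(tsne_results_11):
    dist = np.sum((position - image_positions) ** 2, axis=1)
    if np.min(dist) > 0.02: # if far enough from other images
        image_positions = np.r_[image_positions, [position]]
        # The embedded points are x_valid_norm[:1200], so take thumbnails from the same array
        # (assumes x_valid_norm is scaled to [0, 1]); x_train[index] need not be the same image
        imagebox = mpl.offsetbox.AnnotationBbox(
            mpl.offsetbox.OffsetImage(x_valid_norm[index], cmap="binary"),
            position, bboxprops={"lw": 1})
        plt.gca().add_artist(imagebox)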
plt.axis("off")
plt.show()
[Figure: t-SNE scatter plot of dense-layer activations for Experiment 11, colored by class, with image thumbnails]
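The thumbnail loop above is a greedy spacing filter: a point gets a thumbnail only if its squared distance to every thumbnail already placed exceeds a threshold. Pulled out of the plotting code, the selection step might look like the sketch below; `pick_spread_indices` is a hypothetical name and the toy points are illustrative.

```python
import numpy as np


def pick_spread_indices(points, min_sq_dist=0.02, seed_point=(1.0, 1.0)):
    """Greedily keep indices whose squared distance to every kept point exceeds min_sq_dist.

    `seed_point` plays the role of the initial [1., 1.] entry in the plotting loop:
    it blocks thumbnails near that corner but is not itself returned.
    """
    kept_positions = np.array([seed_point], dtype=float)
    kept_indices = []
    for index, position in enumerate(np.asarray(points, dtype=float)):
        sq_dist = np.sum((position - kept_positions) ** 2, axis=1)
        if np.min(sq_dist) > min_sq_dist:
            kept_positions = np.vstack([kept_positions, position])
            kept_indices.append(index)
    return kept_indices


toy_points = [[0.0, 0.0], [0.05, 0.05], [0.5, 0.5], [0.99, 0.99]]
print(pick_spread_indices(toy_points))  # [0, 2]
```

Because the t-SNE coordinates are scaled to [0, 1], a threshold of 0.02 on the squared distance corresponds to a spacing of roughly 0.14 in plot units. The result depends on the order of the points, since earlier points claim space first.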

Result1: Create a table with the accuracy and loss for train/test/validation & process time for ALL the models.

In [204]:
# Convert the dictionary to a DataFrame
resultDf = pd.DataFrame(results).T
# Replace '\n' with '<br>' in the architecture details
resultDf['Architecture'] = resultDf['Architecture'].str.replace('\n', '<br>')
In [205]:
# Display the table with proper HTML rendering for line breaks
display(HTML(resultDf.to_html(escape=False)))
Experiment    Architecture                            Train Time      Test Accuracy  Test Loss  Train Accuracy  Train Loss  Validation Accuracy  Validation Loss
Experiment1   • DNN with 2 layers                     21.78 seconds   0.474          1.478      0.53            1.311       0.461                1.55
              • no regularization
Experiment2   • DNN with 3 layers                     36.06 seconds   0.474          1.466      0.762           0.655       0.454                2.307
              • no regularization
Experiment3   • CNN with 2 layers/max pooling layers  60.82 seconds   0.713          0.865      0.97            0.085       0.685                2.025
              • 1 full-connected layer
              • no regularization
Experiment4   • CNN with 3 layers/max pooling layers  119.0 seconds   0.736          0.794      0.985           0.048       0.723                2.322
              • 1 full-connected layer
              • no regularization
Experiment5   • DNN with 2 layers (384, 768)          41.89 seconds   0.49           1.482      0.65            0.991       0.477                1.637
              • Batch Normalization
              • L2 Regularization(0.001)
Experiment6   • DNN with 3 layers                     56.52 seconds   0.483          1.48       0.792           0.581       0.478                2.413
              • Regularization: batch normalization
Experiment7   • CNN with 2 layers/max pooling layers  145.38 seconds  0.701          0.907      0.993           0.022       0.724                1.894
              • L2 Regularization(0.001)
Experiment8   • CNN with 3 layers/max pooling layers  101.77 seconds  0.699          0.944      0.984           0.047       0.701                2.117
              • L2 Regularization(0.001)
Experiment9   • CNN with 3 layers/max pooling layers  185.3 seconds   0.798          0.743      0.853           0.574       0.795                0.746
              • Dropout(0.3)
              • L2 Regularization(0.001)
              • Batch Normalization
Experiment10  • CNN with 3 layers/max pooling layers  284.72 seconds  0.811          0.741      0.88            0.49        0.804                0.772
              • 2 Fully-Connected Hidden Layers
              • Dropout(0.3)
              • L2 Regularization(0.001)
              • Batch Normalization
Experiment11  • CNN with 3 layers/max pooling layers  173.08 seconds  0.772          0.944      0.741           1.046       0.771                0.956
              • 2 Fully-Connected Hidden Layers
              • Dropout(variable)
              • L2 Regularization(0.001)
              • Batch Normalization
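One way to read the table is through the gap between training and validation accuracy, which gives a rough indication of overfitting. The sketch below computes that gap for four of the experiments, with values copied from the table; `accuracy_gaps` is a hypothetical helper.

```python
# (train accuracy, validation accuracy) copied from the results table for four experiments.
accuracies = {
    "Experiment2": (0.762, 0.454),
    "Experiment4": (0.985, 0.723),
    "Experiment10": (0.88, 0.804),
    "Experiment11": (0.741, 0.771),
}


def accuracy_gaps(table):
    """Return (name, train accuracy - validation accuracy) pairs, largest gap first."""
    gaps = {name: round(train - val, 3) for name, (train, val) in table.items()}
    return sorted(gaps.items(), key=lambda item: item[1], reverse=True)


for name, gap in accuracy_gaps(accuracies):
    print(f"{name}: {gap:+.3f}")
```

The unregularized models show the largest gaps, while Experiment 11 has a slightly negative one. A negative gap is plausible when dropout is heavy, because dropout is active while training accuracy is measured but disabled during validation.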